How to use Glances + InfluxDB + Grafana to deploy a monitor

Hello, today we will discuss how to use Glances + InfluxDB + Grafana to deploy a cluster monitor that collects your servers' status, such as CPU, disk, and network throughput.
The environment is based on an Amazon EC2 Ubuntu desktop instance.
Glances is a monitoring tool written in Python, used to monitor your Linux system's performance and hardware health.
Please make sure you have a Python 2.7 environment installed before the installation; here I use an Anaconda environment.
pip install glances
pip install --upgrade glances
Find the glances.conf file to configure the connection to InfluxDB, then run glances to make sure it works.
InfluxDB is an open-source database written in Go. It is mainly used to store time-series data, such as CPU status per second.
sudo dpkg -i influxdb_1.0.2_amd64.deb
sudo /etc/init.d/influxdb start
sudo service influxdb start
Access the InfluxDB admin UI in a browser at localhost:8083.
Glances will export to your InfluxDB database through port 8086; create the database and user first:
CREATE DATABASE "glances"
CREATE USER "root" WITH PASSWORD 'root' WITH ALL PRIVILEGES
After you have configured everything properly, run glances --export-influxdb to export the collected data into InfluxDB.
Try to put your glances.conf file in the ~/.config/glances folder (create it if it does not exist).
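For reference, a minimal [influxdb] section of glances.conf might look like the sketch below; the host, port, and the root/root credentials simply mirror the database and user created in the InfluxDB step, so adjust them to your own setup.

[influxdb]
host=localhost
port=8086
user=root
password=root
db=glances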

Grafana

Grafana is an open source, feature rich metrics dashboard and graph editor for Graphite, Elasticsearch, OpenTSDB, Prometheus and InfluxDB.

Installation steps below

wget https://grafanarel.s3.amazonaws.com/builds/grafana_3.1.1-1470047149_amd64.deb
sudo apt-get install -y adduser libfontconfig
sudo dpkg -i grafana_3.1.1-1470047149_amd64.deb
# … configure your options in /etc/grafana/grafana.ini, then
sudo service grafana-server start
Access the Grafana web UI in a browser at localhost:3000 (the default port).
To bring the whole pipeline up, start InfluxDB and Grafana, then launch Glances with the InfluxDB export enabled:

sudo service influxdb start
sudo service grafana-server start
glances --export-influxdb

Cassandra architecture based on DHT

Today we will discuss the bottom layer of Cassandra, separated into four parts.

What is the Cassandra architecture?

How do nodes communicate in the cluster?

How do nodes reach eventual consistency?

How to increase R/W performance?

1. What is the Cassandra architecture?

In traditional internet architecture, most systems used a centralized database, such as early Oracle or MSSQL. With the explosive growth of big data, this ran into a big challenge: the centralized database could no longer handle the traffic. This is where the decentralized data center evolved.


If you have experience with Distributed Hash Tables, you must be familiar with Chord; the Cassandra design model mostly comes from Chord: P2P nodes plus Gossip.

The core part of a distributed system is partitioning, which splits our data source into multiple parts and stores them on different nodes. The popular partitioning methods include:

Range partitioning

List partitioning

Hash partitioning

 


Compared with range partitioning and list partitioning, hash partitioning offers a better balance trade-off: it can guarantee that each node stores roughly the same amount of data.


A hash function can achieve balance in the initial state, but a distributed system is scalable and dynamic: if we add or remove nodes, we have to re-hash the data, which increases the complexity of data migration.
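A minimal sketch of this problem in Python, using the built-in hash() and a hypothetical four-node cluster: with naive hash(key) % N placement, adding a fifth node sends most keys to a different node, so most of the data would have to migrate.

# Naive hash partitioning: node = hash(key) % number_of_nodes.
# Hypothetical keys; Python's built-in hash() stands in for a real partitioner.
keys = ["user:%d" % i for i in range(1000)]

def placement(keys, n_nodes):
    return {key: hash(key) % n_nodes for key in keys}

before = placement(keys, 4)   # original 4-node cluster
after = placement(keys, 5)    # one node added

moved = sum(1 for key in keys if before[key] != after[key])
print("keys that must migrate: %d / %d" % (moved, len(keys)))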


To fix the re-hashing problem caused by adding nodes, a Chord-style ring can be used. Of course, this is not exactly the Chord design; Cassandra optimized and customized some parts.


In this structure, DB1 to DB4 each maintain a list of data, and the tricky problem is that the nodes are unbalanced. In a P2P network each node can serve client requests, but in a real case the node has to compute the hash value of a key and route the request to the node that owns that hash range.

By the way, a traditional cryptographic hash function is pretty slow, so Cassandra uses MurmurHash (Murmur3) instead.
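Just to make that concrete, the third-party mmh3 package (an assumption of this sketch, not something the original setup uses) exposes MurmurHash3, the same hash family as Cassandra's default Murmur3Partitioner:

import mmh3  # pip install mmh3

# 32-bit and 64-bit MurmurHash3 digests of a sample row key.
print(mmh3.hash("user:42"))    # signed 32-bit integer
print(mmh3.hash64("user:42"))  # pair of signed 64-bit integers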

2. How do nodes communicate in the cluster?

In P2P networks, the SWIM and Gossip protocols are pretty popular; Cassandra uses Gossip for communication between nodes.


In Cassandra each node maintains a neighbor list; when a node receives a message, it broadcasts it to the nodes in its neighbor list. The fan-out number and a TTL are used to control how many times a message is forwarded.
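The toy simulation below only illustrates the fan-out plus TTL idea; it is not Cassandra's actual gossip implementation, and the neighbor lists and parameters are made up.

import random

# Each node that holds the message forwards it to `fanout` random neighbors;
# the message carries a TTL that is decremented on every hop until it hits zero.
neighbors = {
    "DB1": ["DB2", "DB3"],
    "DB2": ["DB1", "DB4"],
    "DB3": ["DB1", "DB4"],
    "DB4": ["DB2", "DB3"],
}

def gossip(start, fanout=2, ttl=3):
    informed = {start}
    frontier = [(start, ttl)]
    while frontier:
        node, hops_left = frontier.pop(0)
        if hops_left == 0:
            continue
        targets = random.sample(neighbors[node], min(fanout, len(neighbors[node])))
        for target in targets:
            if target not in informed:
                informed.add(target)
                frontier.append((target, hops_left - 1))
    return informed

print(gossip("DB1"))  # the set of nodes that end up with the message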


For example, when a new node DB5 joins the P2P network, DB5 first sends its message to DB1 and DB3; because DB4 is in DB3's neighbor list, DB4 receives the message directly from DB3.


To place data on the ring, the primary key of each row is converted to a hash value (MD5 in this illustration).


Cassandra uses this ring (circle) structure to address node imbalance, but even after we convert the primary key to a hash, the nodes can still be unbalanced, because the hash values may land unevenly on the ring. Virtual nodes and moving nodes can be used to solve this.

The idea of virtual nodes is that one physical node presents itself as multiple positions (tokens) on the ring; see the sketch below.

With only a few tokens, DB1 and DB2 may end up with very different amounts of data, but if each physical node owns multiple virtual nodes, the probability of an even distribution increases.
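Here is a minimal consistent-hashing sketch of that idea; MD5 stands in for Cassandra's Murmur3 partitioner, the node names are hypothetical, and each physical node owns several virtual tokens on the ring.

import bisect
import hashlib

def token(value):
    # MD5 used as a stand-in hash; Cassandra actually uses Murmur3.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

class Ring(object):
    def __init__(self, nodes, vnodes=8):
        # Each physical node owns `vnodes` positions (virtual nodes) on the ring.
        self._tokens = sorted(
            (token("%s#%d" % (node, i)), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._keys = [t for t, _ in self._tokens]

    def owner(self, key):
        # Walk clockwise to the first token past hash(key), wrapping around.
        idx = bisect.bisect(self._keys, token(key)) % len(self._tokens)
        return self._tokens[idx][1]

ring = Ring(["DB1", "DB2", "DB3", "DB4"])
counts = {}
for i in range(10000):
    node = ring.owner("row-%d" % i)
    counts[node] = counts.get(node, 0) + 1
print(counts)  # with enough virtual nodes the spread is roughly even

The more virtual tokens each physical node owns, the closer these counts get to an even split, which is exactly the balancing effect described above.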


 

 

article continues.

References

http://salsahpc.indiana.edu/b534projects/sites/default/files/public/1_Cassandra_Gala,%20Dhairya%20Mahendra.pdf

http://salsahpc.indiana.edu/b534projects/sites/default/files/public/1_Cassandra_Shankar,%20Vaibhav.pdf

 

Revisiting the Python GIL

When you program in Python, one of the most common terms you hear is the GIL (global interpreter lock). The GIL limits Python's performance in areas such as multithreading, concurrency, and parallelism. Today I would like to share my understanding of the GIL; I hope it will be helpful.


What is the GIL?
The GIL problem is not caused by Python the language; it is a problem of CPython. The idea is similar to C++: we know C++ is a programming language, but it can be compiled by different compilers such as VC++ or Intel C++. Likewise, Python code can be executed by CPython or Jython, and fortunately Jython does not have a GIL problem. Because most systems use CPython as the default environment, people get confused and blame Python itself for the GIL. We have to remember that the GIL is a CPython problem.

 

The official explanation is:

In CPython, the global interpreter lock, or GIL, is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython’s memory management is not thread-safe. (However, since the GIL exists, other features have grown to depend on the guarantees that it enforces.)

Sounds terrible? Let's execute some code that tests multiple threads and see what actually happens.

from threading import Thread
import time

def my_counter():
    i = 0
    for _ in range(100000000):
        i = i + 1
    return True

def main():
    # Baseline: the two threads run one after another, because each
    # thread is joined immediately after it is started.
    start_time = time.time()
    for tid in range(2):
        t = Thread(target=my_counter)
        t.start()
        t.join()
    end_time = time.time()
    print("Total time: {}".format(end_time - start_time))

if __name__ == '__main__':
    main()
from threading import Thread
import time

def my_counter():
    i = 0
    for _ in range(100000000):
        i = i + 1
    return True

def main():
    # Multi-threaded version: start both threads first, then join them,
    # so the two counters run concurrently.
    thread_array = {}
    start_time = time.time()
    for tid in range(2):
        t = Thread(target=my_counter)
        t.start()
        thread_array[tid] = t
    for i in range(2):
        thread_array[i].join()
    end_time = time.time()
    print("Total time: {}".format(end_time - start_time))

if __name__ == '__main__':
    main()

We write a short counter that counts to 100 million in each thread and compare the performance of the sequential version with the multi-threaded one.
In my results, the sequential (effectively single-threaded) version takes 36 seconds and the multi-threaded version takes 60 seconds: the multi-threaded code is much slower than the single-threaded code.

We know how bad the GIL is, but how can we avoid this problem?

Using multiprocessing instead of threading can largely solve the problem: multiprocessing uses processes instead of threads, and each process has its own interpreter and its own GIL, so they do not block each other.
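A minimal sketch of the same counter using multiprocessing instead of threads: on a multi-core machine the two processes run truly in parallel, so the total time should be close to one counter's time rather than double.

from multiprocessing import Process
import time

def my_counter():
    i = 0
    for _ in range(100000000):
        i = i + 1
    return True

def main():
    start_time = time.time()
    processes = [Process(target=my_counter) for _ in range(2)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    print("Total time: {}".format(time.time() - start_time))

if __name__ == '__main__':
    main()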

 

References

https://mail.python.org/pipermail/python-dev/2009-October/093321.html

http://www.dabeaz.com/python/UnderstandingGIL.pdf

 

 

The app store implementation using Java (1)

This project provides an introduction to full-stack web app development using Java, JavaScript, MongoDB, and Python. Whether you plan to dive deeper into web app development or to integrate other work with web applications, a basic foundation in full-stack development will be very helpful in your future work.

This project is separated into three parts: a data crawler, a recommender system, and a front-end.

The crawler is used to collect app information to initialize our database contents, because at the beginning there is nothing in our database. If you do not want to implement the crawler yourself, you can download the database at the end of the article.

The recommender system retrieves data from the database and uses collaborative filtering and cosine similarity to compute the similarity between apps and push the best apps to our users.

The front-end part will use Spring MVC + Java + Hibernate and JavaScript.


We used Python to implement a mini crawler to mine data from

http://appstore.huawei.com/

The app's basic information is stored in div elements on the page; we only need to obtain the appid, name, and description.

import scrapy
import re
from scrapy.selector import Selector
from huaweistore.items import HuaweistoreItem

class ExampleSpider(scrapy.Spider):
    name = "huawei"
    allowed_domains = ["huawei.com"]

    start_urls = [
        "http://appstore.huawei.com/more/all"
    ]

    def parse(self, response):
        page = Selector(response)

        #divs = page.xpath('//div[@class="game-info whole"]')

        hrefs = page.xpath('//h4[@class="title"]/a/@href')
        #hrefs = page.xpath('//div[@class="page-ctrl ctrl-app"]/a/@href')

        for href in hrefs:
            #print href
            url = href.extract()
            yield scrapy.Request(url, callback=self.parse_item)

    def parse_item(self, response):
        page = Selector(response)
        item = HuaweistoreItem()

        item['title'] = page.xpath('//ul[@class="app-info-ul nofloat"]/li/p/span[@class="title"]/text()').extract_first().encode('utf-8')
        item['url'] = response.url
        item['appid'] = re.match(r'http://.*/(.*)', item['url']).group(1)
        item['intro'] = page.xpath('//meta[@name="description"]/@content').extract_first().encode('utf-8')
        #item['thumbnail'] = page.xpath('//div[@class="app-info flt"]/ul/li[@class="img"]/img[@class="app-ico"]/@lazyload').extract_first().encode('utf-8')

        divs = page.xpath('//div[@class="open-info"]')
        recomm = ""
        for div in divs:
            url = div.xpath('./p[@class="name"]/a/@href').extract_first()
            recommended_appid = re.match(r'http://.*/(.*)', url).group(1)
            name = div.xpath('./p[@class="name"]/a/text()').extract_first().encode('utf-8')
            recomm += "{0}:{1},".format(recommended_appid, name)
        item['recommended'] = recomm

        yield item

After we execute the crawler, the data is saved on our local disk; the next step is simply to import that data into our database.

To simplify the app store, storing only the appId, icon, description, and similar apps is enough to deploy the web app; because we do not have a strong server, we do not implement the app download function.
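As an illustration only, one stored app might look like the hypothetical document below; the collection name, the pymongo usage, and the exact field names (other than app_id and top_5_app, which appear in the recommender code later) are assumptions, not taken verbatim from the project's DataService.

from pymongo import MongoClient

# Hypothetical document shape for one app.
app_doc = {
    "app_id": "C10001",                     # made-up id
    "title": "Example App",
    "icon": "http://example.com/icon.png",  # placeholder URL
    "intro": "Short description of the app.",
    "top_5_app": [],                        # filled in by the recommender later
}

client = MongoClient("localhost", 27017)    # assumes a local MongoDB
client["appstore"]["apps"].insert_one(app_doc)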


The crawler's performance is roughly 100 pages per second, although many factors such as bandwidth affect this. A more serious problem is that with such frequent access, our requests may be blocked by the server. To work around this, I suggest rotating your proxy IP address or randomizing the user-agent.

The fake-useragent package will help you: https://pypi.python.org/pypi/fake-useragent
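A small sketch of the user-agent idea, assuming the fake-useragent package is installed; in the spider above you could attach a random User-Agent header to each request, for example:

from fake_useragent import UserAgent

ua = UserAgent()
print(ua.random)  # a different browser string on almost every call

# Inside ExampleSpider.parse, the request could then be built as:
#   yield scrapy.Request(url, callback=self.parse_item,
#                        headers={"User-Agent": ua.random})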

 


Recommender system design.

After we get the initial data, we need a way to show more apps to our users; the recommender system is pretty important in an app store. If a user searches for a specific app, we should recommend some similar apps. For example, if the user searches for the Facebook app, we should also recommend the Twitter app; to implement that, collaborative filtering and cosine similarity will be used.

For more details, please see:

https://en.wikipedia.org/wiki/Collaborative_filtering

https://en.wikipedia.org/wiki/Cosine_similarity

We have multiple elements in our lists; the similarity is based on how many elements the two lists have in common.

We can see that the intersection of userlist1 and userlist2 is {a3, a5, a7}.


The list lengths also influence the similarity: if we look at list u1, the result is 0.5, because the raw match count is diluted by the normalization over the lengths of the two lists.

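A tiny worked example with made-up lists, chosen so that both lists have six apps and three in common, which reproduces the 0.5 above:

import math

# Hypothetical download lists: six apps each, three shared (a3, a5, a7).
u1 = ["a1", "a2", "a3", "a4", "a5", "a7"]
u2 = ["a3", "a5", "a7", "b1", "b2", "b3"]

shared = len(set(u1) & set(u2))                     # 3 apps in common
similarity = shared / math.sqrt(len(u1) * len(u2))  # 3 / sqrt(6 * 6) = 0.5
print(similarity)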

import operator
import math

from dataservice import DataService

class Helper(object):
    @classmethod
    def cosine_similarity(cls, app_list1, app_list2):
        match_count = cls.__count_match(app_list1, app_list2)
        return float(match_count) / math.sqrt(len(app_list1) * len(app_list2))

    @classmethod
    def __count_match(cls, list1, list2):
        count = 0
        for element in list1:
            if element in list2:
                count += 1
        return count

def calculate_top_5(app, user_download_history):
    '''
    cosine similarity between an app and each user's download history
    '''
    # create a dict to store every other app and its similarity to this app
    app_similarity = {}  # {app_id: similarity}
    for apps in user_download_history:
        # calculate the similarity between this app and one download list
        similarity = Helper.cosine_similarity([app], apps)
        # accumulate the similarity for every other app in that list
        for other_app in apps:
            if other_app in app_similarity:
                app_similarity[other_app] = app_similarity[other_app] + similarity
            else:
                app_similarity[other_app] = similarity

    # There could be an app without related apps (not in any download history)
    if app not in app_similarity:
        return

    # sort app_similarity by value and take the top 5 as the recommendation
    app_similarity.pop(app)
    sorted_tups = sorted(app_similarity.items(), key=operator.itemgetter(1), reverse=True)  # sort by similarity
    top_5_app = [app_id for app_id, _ in sorted_tups[:5]]
    #print("top_5_app for " + str(app) + ":\t" + str(top_5_app))

    # store the top 5
    DataService.update_app_info({'app_id': app}, {'$set': {'top_5_app': top_5_app}})

It is pretty easy to implement a simple recommendation system like this, but we have only finished the first step.

 

 

The simplest way to fix the VMware “Failed to lock the file / Cannot open the disk” problem

Today when I started my VMware virtual machine, the system warned:

 

Failed to lock the file

Cannot open the disk ‘/media/remy/WD My Book/Laptop-Remy/Laptop-Remy.vmdk’ or one of the snapshot disks it depends on.

Module DiskEarly power on failed.

Failed to start the virtual machine.

 

The problem is caused by missing or locked snapshot files. It took me a few hours to find a solution; most of the solutions out there are pretty complex and require editing the configuration files, but the simplest solution is actually:

  1. Go to the VMware menu and click “Power On to Firmware”.
  2. You will jump into the VMware BIOS screen.
  3. Now restart your VMware virtual machine, and the problem should be solved.