Time-series clustering of job performance behaviors using Spark/Hadoop
$10-30 USD
Paid on delivery
I am working with a huge number of time series describing job performance behaviors.
The goal of my work is to cluster a large number of job performance behaviors stored as time series, identify the optimal number of clusters using K-means, hierarchical clustering, and Dynamic Time Warping (DTW), and compare the three techniques using PySpark.
I need someone with solid experience in PySpark, especially the DataFrame API, to do the following tasks:
1- Data preprocessing
2- Dimensionality reduction
3- Apply the clustering algorithms (K-means, hierarchical, and Dynamic Time Warping)
4- Identify the optimal number of clusters
5- Compare the three techniques
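For step 3, note that Spark MLlib ships K-means but provides neither hierarchical clustering nor DTW out of the box. As a minimal sketch (the helper name `dtw_distance` is hypothetical, not part of any library), a pure-Python DTW distance like the one below could be wrapped in a Spark UDF to build the pairwise distance matrix that hierarchical clustering needs:

```python
# Minimal Dynamic Time Warping (DTW) distance sketch.
# Assumption: individual series fit in executor memory; at scale this
# function would be applied to pairs of series via a Spark UDF.
def dtw_distance(a, b):
    """DTW distance between two 1-D numeric time series a and b."""
    n, m = len(a), len(b)
    INF = float("inf")
    # (n+1) x (m+1) cumulative-cost matrix; D[0][0] is the empty prefix.
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])  # local point-wise cost
            # Extend the cheapest of the three neighboring alignments.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

# A time-shifted copy of a series has zero DTW distance even though
# its point-by-point (Euclidean) distance would be nonzero.
print(dtw_distance([1, 2, 3], [1, 1, 2, 3]))  # → 0.0
```

This quadratic-time version is the textbook recurrence; for long series one would typically add a Sakoe-Chiba band to limit warping and cut the cost.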
You need to be familiar with the Hadoop ecosystem, Spark, and YARN.
Project ID: #22356884
About the project
9 freelancers are bidding an average of $69 for this job
Hi, I am a very experienced statistician, data scientist, and academic writer. I have completed several PhD-level thesis projects involving advanced statistical analysis of data. I have worked with data from several comp More
Do kindly reach out to me over chat and we can get started on the task. Also, if there is additional information, you can share it with me in a zip file.
Hello! I am very interested in your project. I have been looking for this kind of project on Freelancer for a long time, since I have rich experience in it. I think this project is very suitable for me and I am su More
Hi, I have 3 years of experience in the Big Data ecosystem: Spark, HDFS, YARN, Flume, Kafka, HBase. Please reach out to discuss further. Thanks
Hi, this is Clark from Shanghai, China. I have more than two years of Spark and Scala application development experience. I have worked at PayPal and Google as a full-time employee and graduated from a Chinese top-3 uni More