Text- and location-based clustering for Twitter data - Repost - open to bidding

I have 6 tweet datasets, each about one event. I want someone to perform the following tasks on them.

1. The first step is pre-processing of the data: URL removal, stopword removal, slang removal, POS tagging, duplicate removal, spelling correction, and getting geo-coordinates (I will provide some help with this).

2. The second step is to cluster the tweets around the most common topics (topic clustering). Assign a topic to each tweet and store the tweets with their topic information in a dataframe for step 3. This should be unsupervised.

3. Geo-cluster the tweets. If most of the tweets in a geo-cluster of some radius belong to one topic (out of the 6 topics assigned to the tweets in step 2), then change the topic of each tweet in that cluster from the one assigned in step 2 to the most common topic of the geo-cluster.

4. Since the datasets are labelled, finally evaluate the system with a performance matrix for each dataset (event).
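
To make step 3 concrete, here is a minimal sketch of one possible reading of it, not a required implementation. It assumes each tweet is a dict with hypothetical keys `lat`, `lon` and `topic`, treats the "cluster of some radius" as all tweets within `radius_km` of a given tweet (a plain fixed-radius neighbourhood, not a specific clustering algorithm such as DBSCAN), and treats "most of the tweets" as a share above `min_share`. The function names and both parameters are assumptions made for illustration.

```python
import math
from collections import Counter


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def relabel_by_geo_majority(tweets, radius_km=1.0, min_share=0.5):
    """Return copies of the tweets with topics replaced by the local majority topic.

    Assumption: a tweet's geo-cluster is every tweet (itself included) within
    radius_km of it. Votes always use the step 2 topics, so the result does
    not depend on the order in which tweets are processed.
    """
    relabelled = []
    for tweet in tweets:
        votes = Counter(
            other["topic"]
            for other in tweets
            if haversine_km(tweet["lat"], tweet["lon"], other["lat"], other["lon"]) <= radius_km
        )
        top_topic, top_count = votes.most_common(1)[0]
        share = top_count / sum(votes.values())
        # Only override the step 2 topic when one topic clearly dominates.
        new_topic = top_topic if share > min_share else tweet["topic"]
        relabelled.append({**tweet, "topic": new_topic})
    return relabelled
```

This version compares every pair of tweets, so it is only suitable for small datasets; a spatial index or a library clustering routine would be the natural replacement for larger ones.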

Time: 5 days

Skills: Data Science, Hadoop, Map Reduce, Python, Spark

About the employer:
( 0 reviews ) kulti, India

Project ID: #12748643

11 freelancers are bidding an average of $185 for this job


I am a data scientist and have experience with machine learning and statistical analysis of data using R and Python. I also have experience with big data tech such as Spark and Hadoop. I would like to do the project.

$140 USD in 3 days
(19 reviews)
$155 USD in 3 days
(29 reviews)

4 years of experience in data science and analytics; professional with excellent coding skills in R and Python. - Proficient in R, Python, SQL, Matlab; hands-on experience with VBA & Tableau - Stati...

$300 USD in 3 days
(15 reviews)

Hi, I am a very experienced statistician and academic writer. I have completed several PhD-level thesis projects involving advanced statistical analysis of data. I have worked with data from several companies and have d...

$250 USD in 3 days
(8 reviews)

Hi there! I have read exactly what you need; however, I would like to ask you a few questions. I wouldn't call myself a master, but I do work smart and do not rest until I get the job done. Please feel free to ping me an...

$155 USD in 3 days
(1 review)
$111 USD in 3 days
(17 reviews)

Hello, I am applying for this job because I have a keen interest in big data technology and I'm considering your job post for me with the required capabilities. I have learned Hadoop and its components Hive, Pig, Sqoop, ...

$89 USD in 6 days
(7 reviews)
$277 USD in 10 days
(4 reviews)

Expertise in machine learning and Python programming. Has a lot of interest in natural language processing.

$155 USD in 3 days
(0 reviews)

I have experience in Twitter data mining, tweet preprocessing, tweet clustering and tweet classification from helping my colleague with his dissertation.

$250 USD in 7 days
(0 reviews)

A proposal has not yet been provided

$155 USD in 3 days
(0 reviews)