Text- and location-based clustering for Twitter data - Repost - open to bidding

This project received 12 bids from talented freelancers, with an average bid of $183 USD.

Project Budget
$30 - $250 USD
Project Description

I have 6 tweet datasets, each about a single event. I want someone to perform the following tasks on them.

1. The first step is pre-processing the data: URL removal, stopword removal, slang removal, POS tagging, duplicate removal, spelling correction, and retrieving geo-coordinates (I will provide some help with this last part).

2. The second step is topic clustering: cluster the tweets around the most common topics, assign a topic to each tweet, and store the tweets with their topic information in a dataframe for step 3. This step should be unsupervised.

3. Geo-cluster the tweets. If most of the tweets in a geo-cluster of some radius belong to one topic (out of the 6 topics assigned to the tweets in step 2), change the topic of each tweet in that cluster from the one given in step 2 to the most common topic of the geo-cluster.

4. Since the datasets are labelled, finally evaluate the resulting system with a performance matrix for each dataset (event).
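To make step 1 more concrete, here is a minimal sketch of part of the pre-processing (URL removal, stopword removal, slang handling, duplicate removal). The word lists and the names `STOPWORDS`, `SLANG`, `clean_tweet` and `preprocess` are illustrative assumptions, not part of the project. Slang is mapped to a standard word here rather than deleted, and POS tagging, spelling correction and geo-coordinate lookup are left out because they need external resources.

```python
import re

# Assumption: tiny illustrative lists; a real run would use fuller resources.
STOPWORDS = {"a", "an", "the", "is", "in", "at", "of", "to", "and", "for", "on"}
SLANG = {"u": "you", "gr8": "great", "pls": "please"}  # hypothetical slang map

URL_RE = re.compile(r"https?://\S+|www\.\S+")
TOKEN_RE = re.compile(r"[a-z0-9#@']+")


def clean_tweet(text):
    """Lowercase, strip URLs, normalise slang, and drop stopwords."""
    text = URL_RE.sub(" ", text.lower())
    tokens = [SLANG.get(tok, tok) for tok in TOKEN_RE.findall(text)]
    return [tok for tok in tokens if tok not in STOPWORDS]


def preprocess(tweets):
    """Clean every tweet and drop tweets that are duplicates after cleaning."""
    seen, result = set(), []
    for text in tweets:
        tokens = tuple(clean_tweet(text))
        if tokens and tokens not in seen:
            seen.add(tokens)
            result.append(list(tokens))
    return result
```

Comparing tweets after cleaning means that two tweets differing only by a shortened URL count as duplicates, which is one possible reading of "duplicate removal".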
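Step 2 asks for unsupervised topic clustering but does not name a method. One possible approach, shown here only as a sketch, is spherical k-means over TF-IDF vectors; the function names and the deterministic farthest-first seeding are assumptions made for this example. It uses only the standard library, so it is meant for illustration rather than for large datasets.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn token lists into L2-normalised sparse TF-IDF dicts."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    vectors = []
    for doc in docs:
        vec = {}
        for tok, count in Counter(doc).items():
            vec[tok] = count * (math.log((1 + n) / (1 + df[tok])) + 1)
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


def cosine(a, b):
    """Dot product of two sparse unit vectors (their cosine similarity)."""
    if len(a) > len(b):
        a, b = b, a
    return sum(w * b.get(t, 0.0) for t, w in a.items())


def kmeans_topics(docs, k, iterations=20):
    """Spherical k-means; returns one topic id (0..k-1) per document."""
    vectors = tfidf_vectors(docs)
    # Assumption: farthest-first seeding, chosen so runs are repeatable.
    centroids = [vectors[0]]
    while len(centroids) < k:
        centroids.append(
            min(vectors, key=lambda v: max(cosine(v, c) for c in centroids)))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [max(range(k), key=lambda j: cosine(v, centroids[j]))
                  for v in vectors]
        for j in range(k):
            total = Counter()
            for v, lab in zip(vectors, labels):
                if lab == j:
                    total.update(v)  # sum the member vectors
            if not total:
                continue  # empty cluster: keep the previous centroid
            norm = math.sqrt(sum(w * w for w in total.values())) or 1.0
            centroids[j] = {t: w / norm for t, w in total.items()}
    return labels
```

For the project as described, `k` would be 6. The tweets and their topic ids could then be collected as rows such as `{"tokens": tokens, "topic": topic}`, and a list of such rows can be passed to `pandas.DataFrame` to get the dataframe that step 3 expects.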
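The relabelling rule in step 3 can be sketched as follows. The description does not say how geo-clusters are formed or what "most" means, so this example assumes a simple greedy clustering around seed points, a default radius of 5 km, and a strict majority threshold (`min_share=0.5`); all names and defaults are hypothetical.

```python
import math
from collections import Counter


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(min(1.0, math.sqrt(a)))


def geo_clusters(coords, radius_km):
    """Greedy clustering: join the first cluster whose seed is within radius."""
    seeds, assignment = [], []
    for point in coords:
        for cid, seed in enumerate(seeds):
            if haversine_km(point, seed) <= radius_km:
                assignment.append(cid)
                break
        else:  # no seed close enough: this point starts a new cluster
            seeds.append(point)
            assignment.append(len(seeds) - 1)
    return assignment


def refine_topics(coords, topics, radius_km=5.0, min_share=0.5):
    """Give every tweet in a geo-cluster the majority topic, if one dominates."""
    assignment = geo_clusters(coords, radius_km)
    refined = list(topics)
    for cid in set(assignment):
        idx = [i for i, c in enumerate(assignment) if c == cid]
        topic, count = Counter(topics[i] for i in idx).most_common(1)[0]
        if count / len(idx) > min_share:
            for i in idx:
                refined[i] = topic
    return refined
```

Votes are counted on the step 2 topics, not on already relabelled ones, so the result does not depend on the order in which clusters are processed. A density-based method could replace `geo_clusters` without changing `refine_topics`.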
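For step 4, "performance matrix" is read here as a confusion matrix with per-class precision and recall; that reading is an assumption. Because the topic ids from an unsupervised method are arbitrary, the sketch first names each cluster after the most frequent true label inside it, which is one common convention and not something the description specifies.

```python
from collections import Counter, defaultdict


def map_clusters_to_labels(cluster_ids, truth):
    """Name each cluster id after the most frequent true label inside it."""
    by_cluster = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, truth):
        by_cluster[cluster][label] += 1
    return {c: counts.most_common(1)[0][0] for c, counts in by_cluster.items()}


def confusion_matrix(predicted, truth):
    """Return (labels, matrix); matrix[i][j] counts true i predicted as j."""
    labels = sorted(set(truth) | set(predicted))
    index = {lab: i for i, lab in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for pred, true in zip(predicted, truth):
        matrix[index[true]][index[pred]] += 1
    return labels, matrix


def evaluate(cluster_ids, truth):
    """Confusion matrix plus accuracy and per-class precision/recall."""
    mapping = map_clusters_to_labels(cluster_ids, truth)
    predicted = [mapping[c] for c in cluster_ids]
    labels, matrix = confusion_matrix(predicted, truth)
    correct = sum(matrix[i][i] for i in range(len(labels)))
    report = {"accuracy": correct / len(truth)}
    for i, lab in enumerate(labels):
        tp = matrix[i][i]
        predicted_as = sum(row[i] for row in matrix)  # column total
        actual = sum(matrix[i])                       # row total
        report[lab] = {
            "precision": tp / predicted_as if predicted_as else 0.0,
            "recall": tp / actual if actual else 0.0,
        }
    return labels, matrix, report
```

Running `evaluate` once per dataset gives one matrix per event. Note that a majority mapping tends to flatter the clustering, since several clusters may map to the same label.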

Time: 5 days
