data parsing

Completed · Posted Dec 10, 2011 · Paid on delivery

Hi Drobyshev,

I said I would come back with more things to do. Here's the first :-)

Please reply to this message FIRST. I was not able to attach files.

It has to do with structured data again that has to be presented in a certain way. But this time it is not from a website.

Let me explain.

There are 3 interrelated files. I attached examples of all three:

1. [url removed, login to view]

a. ContentID = The Ad Copy ID

b. ContentName = The name given to the Ad Copy

c. ContentType = The file format associated with the Ad Copy in question.

d. Last Modified = The time and date the ad copy was last modified.

2. [url removed, login to view] (I reduced the number of entries, because this file normally weighs about 20 to 50 megabytes.)

Playlog files contain information for each time an ad copy played on your network. Information provided will resemble the following:

"4197888 2008-04-15 12:13:00 30 679811 1 4901122 4882767 4882765"

In this example, the information provided is the

I. Player id (4197888),

II. timestamp when content playback ended (2008-04-15 12:13:00),

III. how long it played in seconds (30),

IV. the ad copy id itself (679811),

V. the number of screens connected to the Player (1),

VI. the Campaign id (4901122),

VII. the Frame id (4882767)

VIII. and the Display Unit ID (4882765).
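Under the field order listed above, one playlog record can be split into named fields. A minimal sketch; the field names are my own labels for illustration, not names from any playlog specification:

```python
from datetime import datetime

def parse_playlog_line(line):
    """Split one whitespace-separated playlog record into named fields,
    following the field order given in the brief (I-VIII)."""
    parts = line.split()
    return {
        "player_id": int(parts[0]),
        # date and time of playback end are two whitespace-separated tokens
        "ended_at": datetime.strptime(parts[1] + " " + parts[2], "%Y-%m-%d %H:%M:%S"),
        "duration_s": int(parts[3]),
        "content_id": int(parts[4]),   # the ad copy id
        "screens": int(parts[5]),
        "campaign_id": int(parts[6]),
        "frame_id": int(parts[7]),
        "display_unit_id": int(parts[8]),
    }

rec = parse_playlog_line("4197888 2008-04-15 12:13:00 30 679811 1 4901122 4882767 4882765")
print(rec["content_id"], rec["ended_at"].hour)  # → 679811 12
```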

3. [url removed, login to view]

They can all be found at:

[url removed, login to view]

User name: assomatv_multimedia

Password: assm1214
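The exact layout of the content list file isn't shown here, but since <ContentName> is referenced as a tag later in the brief, it is presumably XML. A sketch of loading the ContentID → ContentName legend under that assumption; the element names <Content> and <ContentID> are guesses and may differ in the real file:

```python
import xml.etree.ElementTree as ET

def load_legend(source):
    """Build a ContentID -> ContentName lookup from the content list.
    Assumes an XML layout with <Content> entries carrying <ContentID>
    and <ContentName> children (guessed tag names)."""
    legend = {}
    root = ET.parse(source).getroot()
    for entry in root.iter("Content"):
        cid = entry.findtext("ContentID")
        name = entry.findtext("ContentName")
        if cid is not None and name is not None:
            legend[int(cid)] = name
    return legend
```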

So NOW the task.

The task to be achieved is to produce the file “[url removed, login to view]” (also attached) by parsing the relevant data from [url removed, login to view] and [url removed, login to view].

What I think will be the difficult parts are:

1. [url removed, login to view] is so big and therefore difficult to access;

2. [url removed, login to view] contains info for one day, but “[url removed, login to view]” has to have aggregate data for up to a month (30 × 20-50 MB of info to be extracted and thereafter presented in the desired way). So, if we are on the 7th day of a month, it will have aggregated all the info for the first 7 days.
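Since each daily playlog can weigh 20-50 MB and up to 30 of them must be aggregated, the counting should stream records rather than load whole files into memory. A minimal sketch, assuming the whitespace-separated record layout shown earlier:

```python
from collections import Counter

def count_plays(line_sources):
    """Aggregate play counts per (player_id, content_id, hour) across
    many daily playlogs. line_sources is any iterable of line iterables
    (e.g. open file handles), so each large file is streamed line by
    line instead of being read whole."""
    counts = Counter()
    for lines in line_sources:
        for line in lines:
            parts = line.split()
            if len(parts) < 9:
                continue  # skip blank or malformed records
            player_id, content_id = parts[0], parts[4]
            hour = int(parts[2].split(":")[0])  # hour of playback end
            counts[(player_id, content_id, hour)] += 1
    return counts
```

In use, each element of `line_sources` would be `open(path)` for one daily playlog, so a month's aggregation is `count_plays(open(p) for p in daily_paths)`.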

1. In “reproductions” you see that the sheets are called “location one”, “location two”, “location three”. This is [player id (I)]. So, the first step is to be able to manually assign a logical name to each player id.

2. Column B is “content name”, which is the tag [<ContentName>] in [url removed, login to view]

3. Columns C and D are begin period and end period, which should be deduced from the [timestamp when content playback ended (2008-04-15 12:13:00) (II)].

4. Column E is the number of times a certain [<ContentName>] has been reproduced over the aggregate period of days. The Content Name is represented by the Content ID. The Content ID (= the ad copy id itself; IV) in [url removed, login to view] represents a ContentName; the legend (which Content ID = which ContentName?) can be found in [url removed, login to view].

5. Then columns F to AC are just breakdowns by hour, which can be deduced from the [timestamp when content playback ended (2008-04-15 12:13:00) (II)].
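Putting points 2-5 together, a hedged sketch of building one report row per ContentName, with the total count (column E) and the 24 hourly columns (F to AC); the record and legend shapes here are my own assumptions matching the earlier examples:

```python
from collections import defaultdict

def build_report(records, legend):
    """records: dicts with 'content_id' (int) and 'ended_at' (datetime),
    as produced by a playlog parser; legend maps ContentID -> ContentName.
    Returns ({content_name: {'total': n, 'hours': [24 counts]}},
    begin_period, end_period) for the aggregated period."""
    rows = defaultdict(lambda: {"total": 0, "hours": [0] * 24})
    begin = end = None
    for rec in records:
        # fall back to the raw id when the legend has no name for it
        name = legend.get(rec["content_id"], str(rec["content_id"]))
        rows[name]["total"] += 1
        rows[name]["hours"][rec["ended_at"].hour] += 1
        ts = rec["ended_at"]
        begin = ts if begin is None or ts < begin else begin
        end = ts if end is None or ts > end else end
    return dict(rows), begin, end
```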

That is basically it. What do you think about it?

Thank you.

ATTACHMENT ADDED WITH ALL MENTIONED DATA FILES (THESE ARE JUST EXAMPLES)

Data Processing

Project ID: #1335026

About the project

3 proposals · Remote project · Active Dec 12, 2011

Awarded to:

drobyshev

Thank you for invitation. I send you PM.

$80 USD in 2 days
(3 reviews)
4.1

3 freelancers are bidding an average of $117 for this job

cheungkc

Hi My assumption and proposal is detailed in PM please

$150 USD in 7 days
(6 reviews)
2.9
kamachi2009

Sir, we are ready to work on this. Please check the PMB.

$120 USD in 3 days
(0 reviews)
0.0