
Crawler combination (rebuild)

$250-750 USD

Closed
Posted almost 8 years ago

Paid on delivery
We need someone to combine two of our crawlers. Crawler A: It can scrape a website, strip out the web code, resolve links to absolute paths, store pictures and other resources, and store the page content in MongoDB. The data is kept in a tree structure, just like what you see when you press F12 to inspect the elements of a web page. This crawler also lets us import a file of websites. However, it crawls very slowly because it uses ChromeDriver. Crawler B: It crawls really fast, but it can only write the data to a file with all the web code left in. So basically, we want to combine them. For crawling, we want Crawler B's speed. For every other function, we want Crawler A's behavior, especially the data storage in MongoDB. Please share your previous experience (a sample or demo). Suggestions for a better crawler framework are very welcome.
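The post-processing described for Crawler A (tree-structured content, absolute links) does not itself require a browser. Below is a minimal sketch of that step, assuming the raw HTML has already been fetched over plain HTTP. The names `TreeBuilder` and `html_to_tree` and the node layout (`tag`, `attrs`, `text`, `children`) are hypothetical and are not taken from either crawler.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Void elements never get a closing tag, so they are not pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}
LINK_ATTRS = {"href", "src"}          # attributes resolved to absolute URLs
SKIP_TEXT_IN = {"script", "style"}    # "web code" whose text is dropped


class TreeBuilder(HTMLParser):
    """Builds a nested-dict tree from raw HTML, resolving links as it goes."""

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.root = {"tag": "document", "attrs": {}, "text": "", "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        clean = {}
        for name, value in attrs:
            if name in LINK_ATTRS and value:
                value = urljoin(self.base_url, value)
            clean[name] = value
        node = {"tag": tag, "attrs": clean, "text": "", "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; ignore stray closers.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        node = self.stack[-1]
        text = data.strip()
        if text and node["tag"] not in SKIP_TEXT_IN:
            node["text"] = (node["text"] + " " + text).strip()


def html_to_tree(html, base_url):
    """Parse an HTML string into a nested dict rooted at a 'document' node."""
    parser = TreeBuilder(base_url)
    parser.feed(html)
    parser.close()
    return parser.root
```

The resulting dictionary is plain nested data, so it could be handed to a MongoDB driver as a single document (for example pymongo's `collection.insert_one(tree)`), subject to MongoDB's document size limit. Pages that only show their content after JavaScript runs would still need a browser-based fetch.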
Project ID: 10707115

About the project

9 proposals
Remote project
Active 8 years ago

9 freelancers are bidding an average of $525 USD for this job
Dear Sir, I am a top-ranked programmer with 10 years of experience. I can merge both crawlers into a single fast one. Send me the code.
$555 USD in 15 days
4.8 (464 reviews)
7.5
My guess is that the first crawler uses the Selenium framework, right? The program opens a browser window (as you mentioned, it uses ChromeDriver), waits for the browser to finish rendering each page, and then scrapes the data. The second crawler sends HTTP requests directly, so it is quick, but an HTTP request only returns the original page source; it cannot run JavaScript to render the page. It's impossible to combine the two approaches directly, but there is another way to speed things up: multithreading. To use multiple threads, the work must be splittable into subtasks. For example, if you have 10,000 pages to scrape, you can divide them among 10 threads, with 1,000 pages per thread.
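The page-splitting idea in this bid can be sketched with Python's standard thread pool. This is only an illustration under assumptions: the helper names (`fetch`, `split_into_chunks`, `crawl_chunk`, `crawl`) are hypothetical, and `fetch` uses plain HTTP, so it would not help with pages that need a browser to render.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def fetch(url, timeout=10):
    """Download one page and return its raw bytes (plain HTTP, no JavaScript)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def split_into_chunks(items, n_chunks):
    """Split items into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(items), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def crawl_chunk(urls, fetch_page):
    """Fetch one chunk sequentially; record errors instead of raising."""
    results = {}
    for url in urls:
        try:
            results[url] = fetch_page(url)
        except Exception as exc:  # one bad page should not stop the chunk
            results[url] = exc
    return results


def crawl(urls, n_threads=10, fetch_page=fetch):
    """Give each thread one chunk of URLs and merge the per-chunk results."""
    chunks = split_into_chunks(urls, n_threads)
    merged = {}
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        work = pool.map(lambda chunk: crawl_chunk(chunk, fetch_page), chunks)
        for partial in work:
            merged.update(partial)
    return merged
```

With 10,000 URLs and `n_threads=10`, each thread receives 1,000 pages, matching the example in the bid. Submitting URLs to the pool one at a time instead of in fixed chunks would balance the load better when some pages are much slower than others.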
$555 USD in 10 days
5.0 (44 reviews)
6.3
Hi mate, I have a lot of experience with parsing and extracting links and elements from text. Combining the two crawlers should be a routine task that takes several days. Just contact me to discuss the details and the project will be a breeze.
$350 USD in 7 days
5.0 (2 reviews)
4.4
We have very good experience developing web crawlers and website automation scripts in .NET and have done several similar projects in the past. You can see our reviews and our clients' satisfaction with such projects. Please share the website you want to pull data from into your DB, and we will prepare and send you a sample so that you can be 100% sure that we can do your work. Please message us soon, as we are ready to start today.
$850 USD in 10 days
5.0 (4 reviews)
4.0
Hi, I'm a software developer with 5 years of experience. I have created many scrapers and used all the good frameworks, including Selenium, Jsoup and HttpClient. My last scraping project involved downloading hundreds of thousands of shopping products from Ezbuy and storing the info in a CSV file. I can re-examine the problems with both scrapers and their tasks, then create a superior scraper that better fits your needs. This project would take 1-2 weeks. If you're interested, PM me. Sincerely, Owen McMonagle. Software Eureka.
$750 USD in 10 days
5.0 (3 reviews)
3.3

About this client

上海, China
5.0
45
Payment method verified
Member since Dec 9, 2015

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)