I need aggregated data in CSV or Stata .dta format from the Medicare Physician Fee Schedule, published by the Department of Health and Human Services, covering all years and treatments. Unfortunately, the freely available data on the website is not in a format that can be aggregated easily. See here for an example:
[login to view URL]
Several drop-down forms must be filled in and submitted before a CSV download dialog box opens. I do not have the programming skills to automate the process and am willing to pay for this to be done. The process needs to be automated because there are 16 different time periods, and HCPCS (treatment) codes range from 00000 to 99999. The kicker is that the only way to get the data is by specifying HCPCS codes singly, in groups of five, or as a specified range, as long as the number of resulting rows does not exceed 10,000. The algorithm will therefore probably have to be adaptive: it should detect the "too many results" error and narrow the HCPCS range accordingly.
Aside from the year and HCPCS codes, the remaining options are constant.
Type of info: all
HCPCS Criteria: range of HCPCS codes
Carrier/MAC Option: All Carriers/MACs
Modifier: All Modifiers
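For bidders, the adaptive range handling described above could be sketched roughly as follows. This is only an illustration under assumptions: the `fetch` callable, the `TooManyResults` exception, and the key names in `FIXED_OPTIONS` are hypothetical placeholders, since the actual form fields and error page of the website are not described here. The sketch halves any range that triggers the "too many results" error until each request succeeds.

```python
from typing import Callable, Iterator, List, Tuple

MAX_ROWS = 10_000  # row limit per request, as stated in the posting

# Constant form options from the posting. The key names are hypothetical;
# the real form field names would have to be read from the website.
FIXED_OPTIONS = {
    "type_of_info": "all",
    "hcpcs_criteria": "range of HCPCS codes",
    "carrier_option": "All Carriers/MACs",
    "modifier": "All Modifiers",
}


class TooManyResults(Exception):
    """Hypothetical error a fetch function raises when the site rejects a range."""


def fetch_adaptive(
    fetch: Callable[[str, str, str], List],
    period: str,
    lo: int = 0,
    hi: int = 99999,
) -> Iterator[Tuple[int, int, List]]:
    """Yield (lo, hi, rows) for sub-ranges that the site accepts.

    `fetch(period, start_code, end_code)` is supplied by the caller and is
    expected to raise TooManyResults when the range is too large.
    """
    stack = [(lo, hi)]
    while stack:
        lo, hi = stack.pop()
        try:
            rows = fetch(period, f"{lo:05d}", f"{hi:05d}")
        except TooManyResults:
            if lo == hi:
                raise  # a single code exceeds the limit; cannot split further
            mid = (lo + hi) // 2
            # Push the upper half first so the lower half is processed next,
            # which keeps the output in ascending code order.
            stack.append((mid + 1, hi))
            stack.append((lo, mid))
            continue
        yield lo, hi, rows
```

Halving is only one possible strategy; a script could equally start from a small range and grow it, which might cost fewer rejected requests if most ranges are dense.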
Project Requirements:
At minimum I'd like source code in Python or Perl that I can run myself. The code should download the CSV files from the website for all years and HCPCS codes from 00000 to 99999 and rename the downloaded CSV file to something logical.
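As one possible reading of "something logical", a downloaded file could be named after its time period and HCPCS range. The sketch below assumes a `pfs_` prefix and a `period_start-end` pattern; both are illustrative choices, not a required convention.

```python
from pathlib import Path


def csv_filename(period: str, lo: int, hi: int) -> str:
    """Build a name such as 'pfs_2009_00000-06249.csv' (pattern is an assumption)."""
    return f"pfs_{period}_{lo:05d}-{hi:05d}.csv"


def rename_download(path: Path, period: str, lo: int, hi: int) -> Path:
    """Rename a downloaded CSV in place and return its new path."""
    target = path.with_name(csv_filename(period, lo, hi))
    path.rename(target)
    return target
```

Zero-padding the codes to five digits keeps the files in code order when sorted by name.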
Beyond the minimum, I would prefer that the bidder merge the CSV data, either by year or into one large dataset with an extra variable for the year. File size may be a consideration. I can provide FTP access if needed. If you have experience with Stata, let me know and we can work out extra compensation for a complete and labeled .dta file.
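The merge into one dataset with a year variable might look something like the sketch below, which uses only Python's standard `csv` module. It assumes every downloaded file has the same header row; the function name `merge_with_year` and the column name `year` are illustrative.

```python
import csv
from pathlib import Path
from typing import Iterable, Tuple, Union

PathLike = Union[str, Path]


def merge_with_year(inputs: Iterable[Tuple[str, PathLike]], out_path: PathLike) -> int:
    """Append the CSV files in `inputs` into one file, adding a 'year' column.

    `inputs` is an iterable of (year, csv_path) pairs. All files are assumed
    to share the same header. Returns the number of data rows written.
    """
    writer = None
    rows_written = 0
    with open(out_path, "w", newline="") as out:
        for year, path in inputs:
            with open(path, newline="") as f:
                reader = csv.DictReader(f)
                if reader.fieldnames is None:
                    continue  # skip completely empty files
                if writer is None:
                    # The header is taken from the first non-empty file.
                    fieldnames = list(reader.fieldnames) + ["year"]
                    writer = csv.DictWriter(out, fieldnames=fieldnames)
                    writer.writeheader()
                for row in reader:
                    row["year"] = year
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Rows are streamed one at a time, so memory use stays small even if the merged file is large. For the .dta file, pandas provides `DataFrame.to_stata`, though the labeling would still have to be worked out separately.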
Hi, you said you need a Perl/Python script. Can I use PHP instead? I don't know Python or Perl, but I'm good at PHP. I'm a form scraper (even with CAPTCHA). I can do it in PHP and save the output as a CSV file. Thanks.