php - Large CSV file import to MySQL, best practice


I'm looking for insight on the best approach for large CSV file imports into MySQL, and for managing a dataset of this size. This is for an ecommerce storefront "startup". The product data is read from CSV files downloaded via cURL (server to server).

Each CSV file represents a different supplier/warehouse with 100,000 products. In total there are 1.2 million products spread across 90-100 suppliers. At least 75% of the row data (51 columns) is redundant garbage and not needed.

Would it be better to use mysqli with LOAD DATA LOCAL INFILE into a 'temp_products' table, then make the needed data adjustments per row and insert into the live 'products' table, or to use fgetcsv() and go row by row? The import will be handled by a cronjob using the site's php.ini memory limit of 128M.
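Roughly what I have in mind for the first option -- a minimal sketch only, with placeholder credentials, file path and table name, and assuming local_infile is permitted on the server:

```php
<?php
// Rough sketch of option one: bulk-load a raw supplier CSV into a staging
// table with mysqli.  Credentials, file path and table name are placeholders.
$db = mysqli_init();
$db->options(MYSQLI_OPT_LOCAL_INFILE, true);
$db->real_connect('localhost', 'user', 'pass', 'storefront');

$sql = "LOAD DATA LOCAL INFILE '/tmp/supplier_feed.csv'
        INTO TABLE temp_products
        FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
        LINES TERMINATED BY '\\n'
        IGNORE 1 LINES";

if (!$db->query($sql)) {
    die('Load failed: ' . $db->error);
}
echo $db->affected_rows . " rows staged\n";
```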

  • Apache v2.2.29
  • PHP v5.4.43
  • MySQL v5.5.42-37.1-log
  • memory_limit 128M

I'm not looking for "how to's". I'm looking for the "best approach" from the community's perspective and experience.

I have direct experience of doing something virtually identical to what you describe -- lots of third party data sources in different formats needing to go into a single master table.

I needed to take different approaches for different data sources, because some were in XML, some in CSV, some large, some small, etc. For the large CSV ones, I did indeed follow the route you suggest:

  • I used LOAD DATA INFILE to dump the raw contents into a temporary table.
  • I took the opportunity to transform or discard some of the data within that query; LOAD DATA INFILE allows quite complex statements. This allowed me to use the same temp table for several of the import processes even though they had quite different CSV data, which made the next step easier.
  • I used a set of secondary SQL queries to pull the temp data into the various main tables (see the sketch after this list). All told, I had 7 steps in the process.
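As an illustration of those steps -- the column names (sku, name, price, stock), the supplier_id value and the table layout are invented for the example, not taken from the actual schema:

```php
<?php
// Connection setup as in the question's sketch (placeholders).
$db = mysqli_init();
$db->options(MYSQLI_OPT_LOCAL_INFILE, true);
$db->real_connect('localhost', 'user', 'pass', 'storefront');

// Steps 1-2: load the raw CSV into the staging table, transforming and
// discarding columns in the same statement.  Unwanted CSV columns are
// read into @dummy and simply dropped.
$load = "LOAD DATA LOCAL INFILE '/tmp/supplier_feed.csv'
         INTO TABLE temp_products
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
         LINES TERMINATED BY '\\n'
         IGNORE 1 LINES
         (@sku, @name, @price, @dummy, @dummy, @stock)
         SET sku         = TRIM(@sku),
             name        = TRIM(@name),
             price       = CAST(@price AS DECIMAL(10,2)),
             stock       = IF(@stock = '', 0, @stock),
             supplier_id = 42";   // hypothetical supplier id
$db->query($load) or die($db->error);

// Step 3: one of the secondary queries that pulls cleaned staging rows
// into the live table, updating price/stock when the SKU already exists.
$merge = "INSERT INTO products (sku, name, price, stock, supplier_id)
          SELECT sku, name, price, stock, supplier_id
          FROM temp_products
          ON DUPLICATE KEY UPDATE
              price = VALUES(price),
              stock = VALUES(stock)";
$db->query($merge) or die($db->error);
```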

I had a set of PHP classes for the imports, all of which implemented a common interface. This meant I could have a common front-end program to run all of the importers.
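For example, something along these lines -- the interface name and its methods are made up to illustrate the structure, not taken from the actual code:

```php
<?php
// Hypothetical common interface: every supplier importer exposes the same
// two methods, so a single front-end script can drive all of them.
interface FeedImporter
{
    /** Fetch the source file and bulk-load it into the staging table. */
    public function stage(mysqli $db);

    /** Run the secondary queries that move staged rows into the main tables. */
    public function merge(mysqli $db);
}

// The common front-end: $importers is an array of FeedImporter instances
// and $db an open mysqli connection (both assumed to exist).
foreach ($importers as $importer) {
    $importer->stage($db);
    $importer->merge($db);
}
```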

Since a lot of the importers did similar tasks, I put the commonly used code into traits so the code could be shared.
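The shared behaviour can go into a trait (PHP 5.4+, which matches the version listed in the question); again, the names are invented for illustration, and this builds on the FeedImporter interface sketched above:

```php
<?php
// Hypothetical trait holding behaviour several importers share, such as
// downloading the supplier file over cURL before staging it.
trait DownloadsViaCurl
{
    protected function download($url, $destination)
    {
        $ch = curl_init($url);
        $fp = fopen($destination, 'w');
        curl_setopt($ch, CURLOPT_FILE, $fp);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        $ok = curl_exec($ch);
        curl_close($ch);
        fclose($fp);
        return $ok !== false;
    }
}

// A concrete importer mixes the trait in and implements the interface above.
class AcmeSupplierImporter implements FeedImporter
{
    use DownloadsViaCurl;

    public function stage(mysqli $db)
    {
        $this->download('https://supplier.example.com/feed.csv', '/tmp/acme.csv');
        // ... LOAD DATA LOCAL INFILE into temp_products, as sketched above ...
    }

    public function merge(mysqli $db)
    {
        // ... secondary INSERT ... SELECT queries into the main tables ...
    }
}
```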

Some thoughts based on the things you said in your question:

  • LOAD DATA INFILE is orders of magnitude quicker than fgetcsv() with a PHP loop (a sketch of that loop follows, for comparison).
  • LOAD DATA INFILE statements can be quite complex and achieve good data mapping without ever having to run other code, as long as the imported data is going into a single table.
  • Your memory limit may well need to be raised. However, using LOAD DATA INFILE means MySQL uses the memory, not PHP, so the PHP limit won't come into play for that. 128M is still low, though. If you struggle to import the whole thing in one go, try using simple Linux shell commands to split the file into several smaller chunks; the CSV data format should make that simple.
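For comparison, this is roughly what the fgetcsv() route looks like -- a sketch only, with hypothetical column positions and table columns:

```php
<?php
// The row-by-row alternative: every row costs a PHP iteration plus a
// statement execution, which is where the speed difference comes from.
$db = new mysqli('localhost', 'user', 'pass', 'storefront');

$fh = fopen('/tmp/supplier_feed.csv', 'r');
fgetcsv($fh); // skip the header line

$stmt = $db->prepare(
    "INSERT INTO products (sku, name, price, stock) VALUES (?, ?, ?, ?)"
);

while (($row = fgetcsv($fh)) !== false) {
    // Only a handful of the 51 columns are actually needed.
    $sku   = $row[0];
    $name  = $row[1];
    $price = (float) $row[2];
    $stock = (int) $row[5];
    $stmt->bind_param('ssdi', $sku, $name, $price, $stock);
    $stmt->execute();
}
fclose($fh);
```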
