php - Large CSV file import to MySQL, best practice
I'm looking for insight on the best approach for large CSV file imports into MySQL and for managing the resulting dataset. This is for an ecommerce storefront "startup". The product data is read from CSV files that are downloaded via cURL (server to server).
Each CSV file represents a different supplier/warehouse with up to 100,000 products. In total there are about 1.2 million products spread across 90-100 suppliers. At least 75% of the row data (51 columns) is redundant garbage and not needed.
Would it be better to use mysqli with LOAD DATA LOCAL INFILE into a 'temp_products' table, then make the needed data adjustments per row and insert into the live 'products' table, or to use fgetcsv() and go row by row? The import will be handled by a cron job using the site's php.ini memory limit of 128M.
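For context, here is a rough sketch of the fgetcsv() route I have in mind; the feed URL, table and column names below are placeholders, not my real schema.

<?php
// Rough sketch only: the feed URL, table and column names are placeholders.

// 1. Pull the supplier CSV down via cURL (server to server).
$fh = fopen('/tmp/supplier_feed.csv', 'w');
$ch = curl_init('http://supplier.example.com/feed.csv');
curl_setopt($ch, CURLOPT_FILE, $fh);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_exec($ch);
curl_close($ch);
fclose($fh);

// 2. Walk the file row by row with fgetcsv() and keep only the few useful columns.
$db   = new mysqli('localhost', 'user', 'pass', 'shop');
$stmt = $db->prepare('INSERT INTO products (sku, name, price) VALUES (?, ?, ?)');

$fh = fopen('/tmp/supplier_feed.csv', 'r');
fgetcsv($fh); // skip the header row
while (($row = fgetcsv($fh)) !== false) {
    $sku   = trim($row[0]);   // most of the 51 columns are simply ignored
    $name  = trim($row[1]);
    $price = (float) $row[2];
    $stmt->bind_param('ssd', $sku, $name, $price);
    $stmt->execute();
}
fclose($fh);

The alternative I'm considering would replace that loop with a single LOAD DATA LOCAL INFILE into 'temp_products', followed by INSERT ... SELECT into 'products'.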
- Apache v2.2.29
- PHP v5.4.43
- MySQL v5.5.42-37.1-log
- memory_limit 128M
I'm not looking for "how to's". I'm looking for the "best approach" from the community's perspective and experience.
I have direct experience of doing something virtually identical to what you describe -- lots of third-party data sources in different formats needing to go into a single master table.
I needed to take different approaches for different data sources, because some were in XML, some in CSV, some large, some small, etc. For the large CSV ones, I did indeed follow the route you suggest:
- I used LOAD DATA INFILE to dump the raw contents into a temporary table.
- I took the opportunity to transform or discard some of the data within that query; LOAD DATA INFILE allows quite complex queries (see the sketch after this list). This allowed me to use the same temp table for several of the import processes even though they had quite different CSV data, which made the next step easier.
- I used a set of secondary SQL queries to pull the temp data into the various main tables. All told, I had a 7-step process.
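To make that concrete, here is a stripped-down sketch of the dump-and-transform step plus one of the secondary queries. The table and column names (temp_products, products, sku, name, price) are invented for the example rather than taken from my actual importer.

<?php
// Sketch only: table/column names are invented, and the real file had 51 columns.

$db = mysqli_init();
$db->options(MYSQLI_OPT_LOCAL_INFILE, true); // required for LOAD DATA LOCAL
$db->real_connect('localhost', 'user', 'pass', 'shop');

// Step 1: raw dump into the temp table, transforming/discarding columns as it loads.
$db->query('TRUNCATE TABLE temp_products');
$load = <<<'SQL'
LOAD DATA LOCAL INFILE '/tmp/supplier_feed.csv'
INTO TABLE temp_products
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(@sku, @name, @price, @junk1, @junk2)
SET sku   = TRIM(@sku),
    name  = TRIM(@name),
    price = CAST(@price AS DECIMAL(10,2))
SQL;
$db->query($load); // unwanted CSV columns are read into throwaway @junk variables

// Step 2: one of the secondary queries that pulls the temp data into a main table.
// The hard-coded 42 stands in for whatever supplier id this feed belongs to.
$db->query('
    INSERT INTO products (supplier_id, sku, name, price)
    SELECT 42, sku, name, price FROM temp_products
    ON DUPLICATE KEY UPDATE name = VALUES(name), price = VALUES(price)
');

The point is that the cleanup happens in the SET clause while the data is loading, so PHP never has to touch the individual rows.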
I had a set of PHP classes for the imports, which all implemented a common interface. This meant I could have a common front-end program that could run any of the importers.
Since a lot of the importers did similar tasks, I put the commonly used code in traits so it could be shared.
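Roughly, the structure looked like this; the interface, trait and class names here are made up for illustration, not my actual code.

<?php
// Illustration only -- interface, trait and class names are invented.

interface ProductImporter
{
    /** Download the source file and return the local path. */
    public function fetch();

    /** Load, transform and merge the data into the main tables. */
    public function import($localPath);
}

trait DownloadsViaCurl
{
    // Shared cURL download code used by several importers.
    protected function downloadTo($url, $localPath)
    {
        $fh = fopen($localPath, 'w');
        $ch = curl_init($url);
        curl_setopt($ch, CURLOPT_FILE, $fh);
        curl_exec($ch);
        curl_close($ch);
        fclose($fh);
        return $localPath;
    }
}

class AcmeCsvImporter implements ProductImporter
{
    use DownloadsViaCurl;

    public function fetch()
    {
        return $this->downloadTo('http://acme.example.com/feed.csv', '/tmp/acme.csv');
    }

    public function import($localPath)
    {
        // LOAD DATA LOCAL INFILE into the temp table, then secondary queries (as above).
    }
}

// The common front-end just runs whichever importers are due.
foreach (array(new AcmeCsvImporter()) as $importer) {
    $importer->import($importer->fetch());
}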
Some thoughts based on the things you said in your question:
- LOAD DATA INFILE is orders of magnitude quicker than fgetcsv() in a PHP loop.
- LOAD DATA INFILE queries can be complex and achieve good data mapping without ever having to run any other code, as long as the imported data is going into a single table.
- Your memory limit will likely need to be raised. However, using LOAD DATA INFILE means MySQL uses the memory, not PHP, so your PHP limit won't come into play for that; 128M is still low though.
- If you struggle to import the whole thing in one go, try using simple Linux shell commands to split the file into several smaller chunks; the CSV data format should make that simple (see the sketch after this list).
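For the splitting, something along these lines works; the chunk size, paths and file names are just examples.

<?php
// Example only -- paths, chunk size and table name are placeholders.
// (If the CSV has a header row, strip it first, e.g. with: tail -n +2 feed.csv > feed_body.csv)

// Split the big CSV into 20,000-line chunks named /tmp/chunk_aa, /tmp/chunk_ab, ...
exec('split -l 20000 ' . escapeshellarg('/tmp/supplier_feed.csv') . ' /tmp/chunk_');

// Load each chunk separately so no single LOAD DATA has to swallow the whole file.
$db = mysqli_init();
$db->options(MYSQLI_OPT_LOCAL_INFILE, true);
$db->real_connect('localhost', 'user', 'pass', 'shop');

foreach (glob('/tmp/chunk_*') as $chunk) {
    $db->query(sprintf(
        "LOAD DATA LOCAL INFILE '%s' INTO TABLE temp_products
         FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"'
         LINES TERMINATED BY '\\n'",
        $db->real_escape_string($chunk)
    ));
    unlink($chunk); // tidy up as we go
}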