EPF files badly formatted

So... after racking my brain for hours, and having contacted the EPF support team without an answer so far, I'm going to lay out the problems I'm having with EPF (Enterprise Partner Feed). I can't understand why no one else seems to be complaining already.


First, I should say that I'm using Apple's EPF tool with some modifications, such as using 'mysql.connector' instead of 'mysqldb'. The problem lies in the EPF files themselves, though, not in the importing tool.


EPF files are .tbz (tar bz2) archives that decompress into plain text files. They begin with header lines preceded by '#', which describe the column names, the column types, and the foreign keys. After the headers come all the rows.

Columns and rows are separated by special characters (all of this is explained in the EPF documentation):

  • Field Separator (FS): SOH (ASCII character 1)
  • Record Separator (RS): STX (ASCII character 2) + "\n"
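
To make the layout concrete, here is a minimal Python sketch of how a file with these separators could be split. It is not the code of Apple's EPF tool; the function name `parse_epf` and the sample columns are made up for illustration, and I'm assuming header lines end with the same record separator as data rows.

```python
FS = "\x01"    # field separator: SOH (ASCII character 1)
RS = "\x02\n"  # record separator: STX (ASCII character 2) + newline

def parse_epf(text):
    """Split decompressed EPF text into header lines and rows of fields.

    Hypothetical helper, not part of Apple's EPF tool. Assumes header
    lines are terminated by RS just like data rows.
    """
    headers, rows = [], []
    for record in text.split(RS):
        if not record:
            continue  # empty chunk left after the final separator
        if record.startswith("#"):
            headers.append(record)  # header line, kept unparsed
        else:
            rows.append(record.split(FS))
    return headers, rows

# Made-up sample: one header line and two rows. The second row has a
# plain newline inside a field, which is fine because RS is STX + "\n".
sample = "#id\x01title\x02\n1\x01First app\x02\n2\x01Second\napp\x02\n"
headers, rows = parse_epf(sample)
```

Splitting on the two-character record separator, rather than on newlines alone, is what lets a field contain a newline without breaking the row.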


Well, the problem is that some rows contain the field separator (FS) character where it shouldn't be. That splits one column into two and shifts the content of the following columns into the wrong ones, so MySQL throws an error because of the wrong type. On top of that, the data of the last column is lost, because the EPF tool strips excess columns when parsing. So far I have found this problem in the application file; I'm still trying to solve this mess before moving on to the next files.
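
One way to locate the affected rows before importing is to count the fields in each record and compare that with the number of columns. This is only a sketch under assumptions: `find_bad_rows` and the sample data are hypothetical, and the expected column count is passed in by hand instead of being read from the headers.

```python
FS = "\x01"    # field separator: SOH
RS = "\x02\n"  # record separator: STX + newline

def find_bad_rows(text, expected_fields):
    """Return (record_index, field_count) for rows with the wrong field count.

    Hypothetical helper. record_index counts every record in the file,
    header lines included, starting at 0.
    """
    bad = []
    for index, record in enumerate(text.split(RS)):
        if not record or record.startswith("#"):
            continue  # skip the trailing empty chunk and header lines
        count = record.count(FS) + 1
        if count != expected_fields:
            bad.append((index, count))
    return bad

# Made-up sample with three columns; the second data row has a stray FS
# inside the title, so it appears to have four fields.
sample = (
    "#id\x01title\x01price\x02\n"
    "1\x01Good app\x010.99\x02\n"
    "2\x01Bad\x01app\x011.99\x02\n"
)
bad_rows = find_bad_rows(sample, expected_fields=3)
```

This only reports where the rows are broken. It cannot tell which of the extra separators is the stray one, so repairing a row still needs a look at the column types.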


Is anyone else having these problems?
