Hello friends, we recently ran into some problems with the full dump of the EPF feed dated 20211007.
I'm wondering if anyone has any suggestions or solutions for these.
itunes/collection file
The collection file seems to have "lost" its column types. The collection_id column has changed from BIGINT in the previous file to VARCHAR(1000) in the latest. Similarly, media_type_id was INTEGER and is now VARCHAR(1000), and the datetime fields changed to VARCHAR(1000) as well. Even the longer VARCHAR columns (3000 and 4000 chars) became 1000 chars.
For reference the file header is now reporting the following:
#export_date collection_id name title_version search_terms parental_advisory_id artist_display_name view_url artwork_url original_release_date itunes_release_date label_studio content_provider_name copyright p_line media_type_id is_compilation collection_type_id
#primaryKey:collection_id
#dbTypes:BIGINT VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) INTEGER VARCHAR(1000)
#exportMode:FULL
For comparison, the header of the previous full dump (20210722):
#export_date collection_id name title_version search_terms parental_advisory_id artist_display_name view_url artwork_url original_release_date itunes_release_date label_studio content_provider_name copyright p_line media_type_id is_compilation collection_type_id
#primaryKey:collection_id
#dbTypes:BIGINT BIGINT VARCHAR(1000) VARCHAR(1000) VARCHAR(3000) INTEGER VARCHAR(1000) VARCHAR(1000) VARCHAR(1000) DATETIME DATETIME VARCHAR(1000) VARCHAR(1000) VARCHAR(4000) VARCHAR(4000) INTEGER BOOLEAN INTEGER
#exportMode:FULL
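In case it helps anyone check their own files, here is a rough Python sketch for comparing the declared types in two headers. The function names are my own, not part of any EPF tooling, and it assumes the header fields are whitespace-separated as pasted above; for raw feed files you would need to pass whatever field separator your copy actually uses via `sep`.

```python
def parse_epf_header(header_lines, sep=None):
    """Map column name -> declared type from EPF-style header lines.

    sep=None splits on any whitespace (fine for headers pasted as text);
    pass the real field separator when reading raw feed files.
    """
    names, types = [], []
    for line in header_lines:
        line = line.strip()
        if line.startswith("##"):
            continue  # assumption: double-hash lines are comments, not header fields
        if line.startswith("#dbTypes:"):
            types = line[len("#dbTypes:"):].split(sep)
        elif line.startswith(("#primaryKey:", "#exportMode:")):
            continue
        elif line.startswith("#") and not names:
            # first remaining '#' line is taken to be the column-name line
            names = line[1:].split(sep)
    if len(names) != len(types):
        raise ValueError(f"{len(names)} column names but {len(types)} types")
    return dict(zip(names, types))


def changed_types(old, new):
    """Return {column: (old_type, new_type)} for columns whose type differs."""
    return {
        name: (old[name], new[name])
        for name in old
        if name in new and old[name] != new[name]
    }
```

Run against the two headers above, this should flag ten columns: collection_id, search_terms, parental_advisory_id, original_release_date, itunes_release_date, copyright, p_line, media_type_id, is_compilation and collection_type_id.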
itunes/artist_collection file
In this file the columns are unchanged, so it's a different error: the file contents are not consistent with the primaryKey constraint declared in the file.
E.g. the pkey is reported as the tuple (artist_id,collection_id,role_id).
However, importing with the EPFImporter tool gives many errors on the latest data because there are rows with duplicate keys.
For example, the first error I see is for the following entries (which I extracted manually). The rows are identical except for the "is_primary_artist" value.
export_date artist_id collection_id is_primary_artist role_id
1633587189 36270 1461423948 1 1
1633587189 36270 1461423948 0 1
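To get a sense of how widespread this is, something like the following Python sketch could list every key that occurs more than once. It is only an illustration: `find_duplicate_keys` is a name I made up, and it assumes the rows have already been split into lists of strings in the column order shown above.

```python
from collections import defaultdict


def find_duplicate_keys(columns, rows, primary_key):
    """Group rows by their primary-key tuple and keep keys seen more than once."""
    key_positions = [columns.index(name) for name in primary_key]
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[i] for i in key_positions)
        groups[key].append(row)
    return {key: dupes for key, dupes in groups.items() if len(dupes) > 1}


columns = ["export_date", "artist_id", "collection_id", "is_primary_artist", "role_id"]
rows = [
    ["1633587189", "36270", "1461423948", "1", "1"],
    ["1633587189", "36270", "1461423948", "0", "1"],
]
duplicates = find_duplicate_keys(
    columns, rows, ["artist_id", "collection_id", "role_id"]
)
for key, dupes in duplicates.items():
    print(key, "appears", len(dupes), "times")
```

For a full dump the grouping would hold every row in memory, so in practice it may be better to keep only a count per key, or to let the database report the conflicts.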
I think I can handle the first problem by forcing the column types to be the same as before. But in the second case the data itself has problems, so I'm not sure what to do about it.
Judging by some of the older posts on this forum, I'm not sure there will be any reply, but thanks for looking all the same.