MLRecommender training data error

Question

Created Jul ’20

I have the following training data:

X	user_id	item_id	score
0	d1d1dad9-af15-4bc6-9066-5bc39a830eb0	10034e91-1698-4f16-9cc2-483aa2e84372	1
1	d1d1dad9-af15-4bc6-9066-5bc39a830eb0	22ca0dc8-1607-4f48-bef3-84a267607cf5	1
2	d1d1dad9-af15-4bc6-9066-5bc39a830eb0	396a1d47-ca8f-4189-8526-85e40875c363	35
3	d1d1dad9-af15-4bc6-9066-5bc39a830eb0	4827fb22-4bd1-47f8-aee4-fb45b8900cb6	1
4	f69df0fd-3b7e-489d-9197-28d94be3d281	53fb60b1-7d6c-473f-91bf-42fd670ae055	6
5	8730655f-b7b9-4d36-a4c2-f48e866e4533	53fb60b1-7d6c-473f-91bf-42fd670ae055	1
6	d1d1dad9-af15-4bc6-9066-5bc39a830eb0	53fb60b1-7d6c-473f-91bf-42fd670ae055	1
7	9155b83d-d443-46d8-a24d-cff329eb0d07	73bfd56b-b799-43cd-b17b-4ef259d18fcc	35
8	d1d1dad9-af15-4bc6-9066-5bc39a830eb0	be9d114d-2a16-4e24-b142-44fc97351cc6	1
9	d1d1dad9-af15-4bc6-9066-5bc39a830eb0	d8ab8d67-7efa-4370-b7d0-6d176c81901f	1

When I try to train an MLRecommender model on it, I get this error:

Item IDs in the recommender model must be numbered 0, 1, ..., numitems - 1

When I do not use the score column as the rating, the model trains, but the score data is very important.

I tried mocking the data with a "dummy" user_id that gives a score of 0 to every item_id, and a "dummy" item_id that is scored by every user_id, but this did not fix the training error.

How can I train my model with the data above?

Answer 1

PawelMadej OP

Jul ’20

Here are the source files: the original data and the mocked, normalised version.

https://gist.github.com/nysander/4a37db1abde1bfa4b58706ca2d1ae27e

Answer 2

Developer Tools Engineer

Apple

Jul ’20

Thanks for supplying the data, that's really going to help us.

Looks like this feedback is yours: FB7854032

Answer 3

PawelMadej OP

Jul ’20

Yes, that is my feedback.

Update:

The error is the same in the Create ML app bundled with Xcode 11.5 and with Xcode 12 beta.

Answer 4

Developer Tools Engineer

Apple

Jul ’20

Hi Paweł. Thanks for the sample data, it was very helpful, and I was able to confirm the issue existed on macOS Catalina.

I was able to train using your data without an error on macOS Big Sur beta 2 with Xcode 12 beta 2.

Answer 5

Apple

Jul ’20

Hi Pawel, thank you for taking the time to reach out. A potential workaround for this issue on macOS Catalina is to add a dummy user that has rated all items and a dummy item that has been rated by all users. If you are able to update your OS to the latest macOS 11, this issue should be addressed.
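
As a rough illustration of the workaround described above, the sketch below appends a dummy user that rates every item and a dummy item that is rated by every user. It is plain Python; the function name, the `dummy-user` and `dummy-item` identifiers, and the default score of 0.1 are assumptions made for illustration, not anything specified by Apple (a later reply in this thread reports that a dummy score of 0 did not help).

```python
def add_dummy_user_and_item(rows, dummy_score=0.1,
                            dummy_user="dummy-user", dummy_item="dummy-item"):
    """Return a copy of rows with dummy ratings appended.

    rows is a list of (user_id, item_id, score) tuples.
    All names and the default score are hypothetical.
    """
    users = sorted({user for user, _, _ in rows})
    items = sorted({item for _, item, _ in rows})

    result = list(rows)
    # The dummy user rates every real item.
    result += [(dummy_user, item, dummy_score) for item in items]
    # Every real user rates the dummy item.
    result += [(user, dummy_item, dummy_score) for user in users]
    # The dummy user also rates the dummy item, so neither has a gap.
    result.append((dummy_user, dummy_item, dummy_score))
    return result
```

The returned rows can then be written back out in the same user_id, item_id, score layout as the original training file.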

Answer 6

PawelMadej OP

Jul ’20

Accepted Answer

Hello. As I added to the Feedback, I used normalised data with dummy ratings of 0.0, and that did not work either.

I talked with another person who had similar problems, and based on this comment:

https://stackoverflow.com/posts/comments/110994450?noredirect=1

I tested my dummy ratings with a value greater than 0 (I used 0.1), and the model trained for the first time with ratings enabled.

So the data has to meet two conditions:

it must be normalised (every user rates every item)
the dummy ratings must be greater than 0

I've written a Python script using pandas to normalise the data, and I'm really happy to have it working.
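
The script itself is not included in this thread. A minimal sketch of the same idea, written in plain Python rather than pandas, might look like the following: it fills in every missing user/item pair with a small positive dummy rating so that every user rates every item. The function name and the 0.1 default are assumptions for illustration.

```python
from itertools import product


def fill_missing_ratings(rows, fill_score=0.1):
    """Return one (user_id, item_id, score) row per user/item pair.

    Existing scores are kept; missing pairs get fill_score, which must
    be greater than 0 (0 was reported above not to work).
    """
    if fill_score <= 0:
        raise ValueError("fill_score must be greater than 0")

    known = {(user, item): score for user, item, score in rows}
    users = sorted({user for user, _ in known})
    items = sorted({item for _, item in known})

    # Cartesian product of users and items gives the fully dense table.
    return [(user, item, known.get((user, item), fill_score))
            for user, item in product(users, items)]
```

For example, three ratings from two users over two items become four rows, with the one missing pair filled in at 0.1.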

Please update the docs with this information if possible, so that people trying to use MLRecommender know directly what they need to do to make it work.
