Import and format data to create and evaluate a machine learning model.
Use a table to store your training data when you want to use a standard JSON or CSV file to create a model, or when you want to train a model from custom types you've created in your own code. Each row contains an example of the data you're training the model to classify.
At the point in your code where you use the data to train a model, you select a column to be the target of the machine learning model's predictions. The remaining columns contain the features you provide to the model to make a prediction. Figure 1 illustrates how to structure training data for a classifier that will predict a book’s genre.
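As a rough sketch of that step, the snippet below builds a small table from hypothetical book data and names the genre column as the prediction target. It assumes the table type is Create ML's MLDataTable and uses MLClassifier as one possible trainer. The column names and values are invented for illustration, and a table this small may be rejected during training or produce a meaningless model.

```swift
import CreateML

// Hypothetical book data: each key is a column, and position i across the
// arrays forms row i (one book per row).
let bookData: [String: MLDataValueConvertible] = [
    "title": ["Alice in Wonderland", "Hamlet", "Treasure Island", "Peter Pan"],
    "author": ["Lewis Carroll", "William Shakespeare", "Robert L. Stevenson", "J. M. Barrie"],
    "pageCount": [124, 98, 280, 94],
    "genre": ["Fantasy", "Drama", "Adventure", "Fantasy"]
]

do {
    let bookTable = try MLDataTable(dictionary: bookData)

    // "genre" is the target column. The remaining columns are treated as
    // the features used to make the prediction.
    let classifier = try MLClassifier(trainingData: bookTable, targetColumn: "genre")
    print(classifier)
} catch {
    // Toy data like this may be too small to train on.
    print("Could not build the table or train the model: \(error)")
}
```

In a real project the table would hold many more rows per genre, and the target column name would match whatever your own data uses.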
When you create an MLData from a CSV or JSON file, the Create ML framework translates the structure of your input data directly into tabular data.
Import JSON Data
To create a data table from JSON data, you use the init(contents initializer to create a row from each dictionary in the root JSON array. The names of the keys in each dictionary are used as the names of the columns in the table.
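The JSON import can be sketched as follows, assuming the table type is Create ML's MLDataTable and that the file is loaded with its contentsOf initializer. The file name, helper function, and book records are hypothetical. The snippet writes a sample file to a temporary directory only so that it is self-contained.

```swift
import CreateML
import Foundation

// Hypothetical sample: a root JSON array in which each dictionary is one row.
let booksJSON = """
[
    { "title": "Alice in Wonderland", "author": "Lewis Carroll", "pageCount": 124, "genre": "Fantasy" },
    { "title": "Hamlet", "author": "William Shakespeare", "pageCount": 98, "genre": "Drama" },
    { "title": "Treasure Island", "author": "Robert L. Stevenson", "pageCount": 280, "genre": "Adventure" }
]
"""

// Hypothetical helper: writes the sample to a temporary .json file and loads it.
func loadBooksFromJSON() throws -> MLDataTable {
    let url = FileManager.default.temporaryDirectory
        .appendingPathComponent("books.json")
    try booksJSON.write(to: url, atomically: true, encoding: .utf8)
    return try MLDataTable(contentsOf: url)
}

do {
    let table = try loadBooksFromJSON()
    print(table.columnNames)  // column names come from the dictionary keys
    print(table.rows.count)   // one row per dictionary in the root array
} catch {
    print("Could not import the JSON file: \(error)")
}
```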
Import Tabular Data
MLData can import data from an in-memory dictionary or a CSV file. A CSV (comma-separated values) file is a textual representation of a table. You can create a CSV file programmatically or use an app like Numbers to export a spreadsheet. Both formats translate directly into a data table as rows and columns.
For example, the init(dictionary:) initializer uses the keys in the provided dictionary as column names. The value for each column key is an array of the values for that column. You can use an MLData to represent a column of values, or any type that conforms to the
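A minimal sketch of the dictionary form, assuming the table type is Create ML's MLDataTable and that plain Swift arrays are acceptable column values. The column names and values are hypothetical.

```swift
import CreateML

// Hypothetical column data: each key becomes a column name and each array
// supplies that column's values, so the arrays are expected to be the same length.
let columns: [String: MLDataValueConvertible] = [
    "title": ["Alice in Wonderland", "Hamlet", "Treasure Island"],
    "pageCount": [124, 98, 280]
]

do {
    let table = try MLDataTable(dictionary: columns)
    print(table.columnNames)  // the dictionary keys
    print(table.rows.count)   // one row per array position
    print(table)              // a textual preview of the table
} catch {
    print("Could not create the table: \(error)")
}
```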
When you import a CSV file with the init(contents initializer, it creates a row in the table from each line in the CSV file.
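The CSV import might look like the sketch below, again assuming Create ML's MLDataTable and its contentsOf initializer with default parsing options. The file name, helper function, and rows are hypothetical, and the sample assumes the first line of the file is a header that supplies the column names.

```swift
import CreateML
import Foundation

// Hypothetical sample: a header line followed by one line per book.
let booksCSV = """
title,author,pageCount,genre
Alice in Wonderland,Lewis Carroll,124,Fantasy
Hamlet,William Shakespeare,98,Drama
Treasure Island,Robert L. Stevenson,280,Adventure
"""

// Hypothetical helper: writes the sample to a temporary .csv file and loads it.
func loadBooksFromCSV() throws -> MLDataTable {
    let url = FileManager.default.temporaryDirectory
        .appendingPathComponent("books.csv")
    try booksCSV.write(to: url, atomically: true, encoding: .utf8)
    return try MLDataTable(contentsOf: url)
}

do {
    let table = try loadBooksFromCSV()
    print(table.columnNames)  // taken from the header line
    print(table.rows.count)   // one row per data line
} catch {
    print("Could not import the CSV file: \(error)")
}
```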