TabularData

Import, organize, and prepare a table of data to train a machine learning model.

TabularData Documentation

Export and use Create ML model in runtime

Hello! In my iPhone app, I have created a button so that I can retrain my Create ML linear regression model:

    let model = try MLLinearRegressor.init(trainingData: trainingData.base, targetColumn: "Target")

Now I want to use this model in the UI. I know that I can use the model above directly, like this:

    let pred = try model.predictions(from: testData)

However, I would prefer to use .mlmodel files, because I want to receive one prediction at a time instead of having to pass in a DataFrame. I have seen examples where people use the model's write method to store the .mlmodel file on the computer desktop, for example, but I want to do it at runtime in the app. Is this possible? If so, how can I store the .mlmodel file and load it again? Can I do it in the app repository?
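One possible approach, sketched under assumptions rather than confirmed by the poster: write the trained regressor into the app's Documents directory, compile the file with Core ML, and load the compiled model for single-row predictions. The file name "Regressor", the input feature name "Feature", and all helper names are hypothetical. The Create ML and Core ML parts are guarded with canImport, so only the path helper compiles on other platforms, and the synchronous compileModel(at:) call may produce a deprecation warning on newer SDKs.

```swift
import Foundation
#if canImport(CreateML) && canImport(CoreML)
import CreateML
import CoreML
#endif

/// Hypothetical helper: location of the .mlmodel file inside a directory.
func modelFileURL(in directory: URL, name: String = "Regressor") -> URL {
    directory.appendingPathComponent(name).appendingPathExtension("mlmodel")
}

#if canImport(CreateML) && canImport(CoreML)
/// Writes the trained regressor to Documents, compiles it, and loads it as an MLModel.
func storeAndLoad(_ regressor: MLLinearRegressor) throws -> MLModel {
    let documents = try FileManager.default.url(for: .documentDirectory,
                                                in: .userDomainMask,
                                                appropriateFor: nil,
                                                create: true)
    let fileURL = modelFileURL(in: documents)
    try regressor.write(to: fileURL)                         // stores the .mlmodel file
    let compiledURL = try MLModel.compileModel(at: fileURL)  // produces a .mlmodelc in a temporary location
    return try MLModel(contentsOf: compiledURL)
}

/// One prediction at a time. "Feature" and "Target" are assumed column names.
func predictTarget(with model: MLModel, feature: Double) throws -> Double? {
    let input = try MLDictionaryFeatureProvider(dictionary: ["Feature": feature])
    let output = try model.prediction(from: input)
    return output.featureValue(for: "Target")?.doubleValue
}
#endif
```

Because the compiled model lands in a temporary location, an app that wants to reuse it across launches would probably move it to a permanent directory first.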

1.3k

Oct ’22

How to get a DataFrame of first n elements and last n elements from a sorted DataFrame

I have the following CSV data, which is loaded into a DataFrame:

    import Foundation
    import TabularData

    func main() {
        let csv = """
        Text,Value
        Frog,47
        Duck,11
        Horse, 11
        Bee, 12
        Spider,55
        Flower,1
        Tree,100
        """
        var df = try! DataFrame(csvData: Data(csv.utf8))
        df = df.sorted(on: "Value", order: .descending)
        print(df)
        /* Prints
        ┏━━━┳━━━━━━━━━━┳━━━━━━━━━━┓
        ┃   ┃ Text     ┃ Value    ┃
        ┃   ┃ <String> ┃ <Double> ┃
        ┡━━━╇━━━━━━━━━━╇━━━━━━━━━━┩
        │ 0 │ Tree     │    100,0 │
        │ 1 │ Spider   │     55,0 │
        │ 2 │ Frog     │     47,0 │
        │ 3 │ Bee      │     12,0 │
        │ 4 │ Duck     │     11,0 │
        │ 5 │ Horse    │     11,0 │
        │ 6 │ Flower   │      1,0 │
        └───┴──────────┴──────────┘
        */
    }

    main()

I want, for example, only the first two and last two elements from the DataFrame above:

    ┏━━━┳━━━━━━━━━━┳━━━━━━━━━━┓
    ┃   ┃ Text     ┃ Value    ┃
    ┃   ┃ <String> ┃ <Double> ┃
    ┡━━━╇━━━━━━━━━━╇━━━━━━━━━━┩
    │ 0 │ Tree     │    100,0 │
    │ 1 │ Spider   │     55,0 │
    │ 2 │ Horse    │     11,0 │
    │ 3 │ Flower   │      1,0 │
    └───┴──────────┴──────────┘
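A minimal sketch of one way this might be done, assuming the prefix(_:) and suffix(_:) slices of a data frame can be turned back into data frames and appended to each other. The helper names are hypothetical, and the TabularData part is guarded so the row-count arithmetic compiles on its own.

```swift
import Foundation
#if canImport(TabularData)
import TabularData
#endif

/// Hypothetical helper: how many rows to take from the head and from the tail
/// so that the two parts never overlap when the frame has fewer than 2 * n rows.
func headTailCounts(rowCount: Int, n: Int) -> (head: Int, tail: Int) {
    let head = max(0, min(n, rowCount))
    let tail = max(0, min(n, rowCount - head))
    return (head, tail)
}

#if canImport(TabularData)
/// Returns a new data frame holding the first and last `n` rows of `df`.
func firstAndLast(_ n: Int, of df: DataFrame) -> DataFrame {
    let counts = headTailCounts(rowCount: df.rows.count, n: n)
    var result = DataFrame(df.prefix(counts.head))     // copy the head slice
    result.append(DataFrame(df.suffix(counts.tail)))   // add the tail rows below it
    return result
}
#endif
```

With the sorted frame from the post, firstAndLast(2, of: df) would be expected to keep the top two and bottom two rows in their sorted order.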

Programming Languages Swift Swift TabularData

1.1k

Sep ’22

Dataframe.addAlias not working as expected.

I'm trying to add aliases for columns in a data frame, but the function does not seem to do anything:

    DF.addAlias("None", forColumn: col.name)
    print(DF.columnNames(forAlias: "None"))

This never prints anything. Any ideas? What am I doing wrong?

Programming Languages Swift Swift TabularData

887

Sep ’22

There is a skiprow option in the MLDataTable framework; what is the equivalent in the Swift TabularData framework?

I was working with this CSV file, whose header is located in the 4th row. How do I skip the first 3 rows in the TabularData framework? Note that a skiprow option is available in the MLDataTable framework as well as in Python's pandas.

    import Foundation
    import TabularData

    let options = CSVReadingOptions(
        hasHeaderRow: false,
        nilEncodings: ["", "nil"],
        ignoresEmptyLines: true
    )
    let dataPath = "https://www2.census.gov/programs-surveys/saipe/datasets/time-series/model-tables/irs.csv"
    var dataFrame = try! DataFrame(contentsOfCSVFile: URL(string: dataPath)!, rows: 0..<15, options: options)
    print(dataFrame.description)
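A possible workaround, offered as a sketch and not as a built-in TabularData feature: load the file as text, drop the leading lines, and hand the remaining text to DataFrame(csvData:options:). The helper names are hypothetical, and the approach assumes UTF-8 text with no quoted fields containing line breaks inside the skipped lines.

```swift
import Foundation
#if canImport(TabularData)
import TabularData
#endif

/// Drops the first `count` lines of CSV text so that the header becomes line 1.
/// Splitting on `isNewline` also handles "\r\n" line endings.
func droppingLeadingLines(_ count: Int, from text: String) -> String {
    text.split(omittingEmptySubsequences: false, whereSeparator: \.isNewline)
        .dropFirst(count)
        .joined(separator: "\n")
}

#if canImport(TabularData)
/// Hypothetical loader: skips `count` lines, then parses the rest with a header row.
func loadCSV(at url: URL, skipping count: Int) throws -> DataFrame {
    let text = try String(contentsOf: url, encoding: .utf8)
    let trimmed = droppingLeadingLines(count, from: text)
    let options = CSVReadingOptions(hasHeaderRow: true,
                                    nilEncodings: ["", "nil"],
                                    ignoresEmptyLines: true)
    return try DataFrame(csvData: Data(trimmed.utf8), options: options)
}
#endif
```

For the file in the post, loadCSV(at: url, skipping: 3) would treat the 4th line as the header.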

Machine Learning & AI Create ML Create ML TabularData

1.4k

Sep ’22

TabularData Framework: DataFrame Headers as a List in SwiftUI

I can't figure out how to display the headers of a DataFrame in SwiftUI. According to this post, I am only able to display rows in SwiftUI. Or am I completely wrong? Is there a way to convert the headers of a DataFrame to a String array ([String])? For example: dataframe.columns.map { col in col.name }

UI Frameworks SwiftUI Swift Frameworks SwiftUI TabularData

1.5k

Sep ’22

Ask an Expert: Does TabularData do much of what Python's pandas framework does?

I'm looking for a way to easily (or more easily than rewriting a time series data framework) deal with stock market data. I apparently need to preprocess much of the data I can get from typical APIs (Finnhub.io, AlphaVantage.co) to remove the weekend days from the datasets.

Problem: When using the awesome NEW Charts framework to plot prices by daily close price, I get weekends and holidays in my charts. No "real" stock charting tool does this. They somehow remove the non-market days from their charts. How?

While researching, I found the Python pandas library for time series data. Can Apple's TabularData do this time series data manipulation for me? Can you share an example? Thanks! David
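As a rough illustration of the weekend-removal step only, here is a sketch that filters on Calendar.isDateInWeekend(_:). The DailyClose type, the function names, and the date column name are hypothetical; market holidays are not handled and would need a separate list of dates. The TabularData variant is guarded and assumes the column holds Date values.

```swift
import Foundation
#if canImport(TabularData)
import TabularData
#endif

/// Hypothetical record for one daily bar.
struct DailyClose {
    let date: Date
    let close: Double
}

/// Keeps only the bars whose date is not a weekend day in the given calendar.
func removingWeekends(_ bars: [DailyClose], calendar: Calendar) -> [DailyClose] {
    bars.filter { !calendar.isDateInWeekend($0.date) }
}

#if canImport(TabularData)
/// Same idea for a data frame: rows with a missing or weekend date are dropped.
func removingWeekendRows(from frame: DataFrame,
                         dateColumn: String,
                         calendar: Calendar) -> DataFrame {
    let slice = frame.filter(on: dateColumn, Date.self) { date in
        guard let date = date else { return false }
        return !calendar.isDateInWeekend(date)
    }
    return DataFrame(slice)
}
#endif
```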

Machine Learning & AI General Machine Learning Create ML TabularData

1.3k

Aug ’22

func jsonRepresentation(options: JSONWritingOptions = .init()) throws -> Data returns an UnsafeRawPointer

In TabularData.DataFrame, calling myDataFrame.jsonRepresentation() returns an UnsafeRawPointer. Is there a workaround for this, please?
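Since the signature in the title returns Data, one thing worth trying is to decode that Data as UTF-8 text before printing or inspecting it. The sketch below assumes that is all that is needed; the helper names are hypothetical and the TabularData call is guarded.

```swift
import Foundation
#if canImport(TabularData)
import TabularData
#endif

/// Decodes JSON bytes into a readable string (invalid UTF-8 is replaced, not rejected).
func jsonString(from data: Data) -> String {
    String(decoding: data, as: UTF8.self)
}

#if canImport(TabularData)
/// Hypothetical convenience: the JSON text for a data frame.
func jsonText(of dataFrame: DataFrame) throws -> String {
    let data = try dataFrame.jsonRepresentation()
    return jsonString(from: data)
}
#endif
```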

Machine Learning & AI General TabularData

1.5k

Jul ’22

Dataframe columns are ordered alphabetically when you print the dataframe

    import Cocoa
    import TabularData

    let greeting = "Decimal Type Evaluation"
    let JSON = """
    [{
        "product": "Apple",
        "type": "Fruit",
        "weight": 7.5,
        "unit_price": 0.34
    },
    {
        "product": "Pear",
        "type": "Fruit",
        "weight": 0.5,
        "unit_price": 0.25
    }]
    """

    struct Product: Decodable {
        let product: String
        let type: String
        let weight: Double
        let unit_price: Double
    }

    let jsonData = JSON.data(using: .utf8)!
    let products: [Product] = try! JSONDecoder().decode([Product].self, from: jsonData)
    var dataframe = try! DataFrame(jsonData: jsonData)
    print(dataframe)

This results in the output below, showing that the columns are now ordered alphabetically and not in the order in which they appear in the array or the struct definition.

    ┏━━━┳━━━━━━━━━━┳━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━┓
    ┃   ┃ product  ┃ type     ┃ unit_price ┃ weight   ┃
    ┃   ┃ <String> ┃ <String> ┃ <Double>   ┃ <Double> ┃
    ┡━━━╇━━━━━━━━━━╇━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━┩
    │ 0 │ Apple    │ Fruit    │       0.34 │      7.5 │
    │ 1 │ Pear     │ Fruit    │       0.25 │      0.5 │
    └───┴──────────┴──────────┴────────────┴──────────┘
    2 rows, 4 columns
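If the goal is simply to get the columns back into the struct's declaration order, one option might be to reorder them after loading. The sketch below assumes selecting(columnNames:) returns the columns in the order given; it takes that order from an explicit CodingKeys enum made CaseIterable, which is an addition to the poster's struct, and columnOrder is a hypothetical name.

```swift
import Foundation
#if canImport(TabularData)
import TabularData
#endif

struct Product: Decodable {
    let product: String
    let type: String
    let weight: Double
    let unit_price: Double

    /// Explicit keys so the declaration order can be read back at runtime.
    enum CodingKeys: String, CodingKey, CaseIterable {
        case product, type, weight, unit_price
    }

    /// Column names in declaration order.
    static var columnOrder: [String] {
        CodingKeys.allCases.map(\.stringValue)
    }
}

#if canImport(TabularData)
/// Returns a copy of `dataFrame` with its columns in the struct's declaration order.
func reordered(_ dataFrame: DataFrame) -> DataFrame {
    dataFrame.selecting(columnNames: Product.columnOrder)
}
#endif
```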

Machine Learning & AI General TabularData

1.2k

Jul ’22

TabularData Framework: crash on opening CSV file

Hi, I am getting a crash report from a user whose application crashes when they open a CSV file on their device. I use the standard DataFrame(contentsOfCSVFile: fileURL, options: options) initializer to create a DataFrame, and that is where it's crashing, even though the call is inside a do-catch block:

    public func loadInitialCSVData(withURL fileURL: URL) throws -> DataFrame {
        let options = CSVReadingOptions(hasHeaderRow: true, delimiter: ",")
        do {
            let dataFrame = try DataFrame(contentsOfCSVFile: fileURL, options: options)
            return dataFrame
        } catch {
            // log error here - doesn't get here
            throw error
        }
    }

This is from the crash report:

    Exception Type: SIGTRAP
    Exception Codes: TRAP_BRKPT at 0x21e02be38
    Crashed Thread: 0

    Thread 0 Crashed:
    0 TabularData 0x000000021e02be38 __swift_project_boxed_opaque_existential_1 + 9488
    1 TabularData 0x000000021e099d64 __swift_memcpy17_8 + 4612
    2 TabularData 0x000000021e099958 __swift_memcpy17_8 + 3576
    3 TabularData 0x000000021e09935c __swift_memcpy17_8 + 2044
    4 Contacts Journal CRM 0x000000010433f614 Contacts_Journal_CRM.CJCSVHeaderMapper.loadInitialCSVData(withURL: Foundation.URL) throws -> TabularData.DataFrame (CJCSVHeaderMapper.swift:26)
    5 Contacts Journal CRM 0x00000001043009d8 (extension in Contacts_Journal_CRM):__C.MacContactsViewController.handleSelectedCSVFileForURL(selectedURL: Foundation.URL) -> () (MacContactsViewControllerExtension.swift:28)
    6 Contacts Journal CRM 0x0000000104301e64 @objc (extension in Contacts_Journal_CRM):__C.MacContactsViewController.handleSelectedCSVFileForURL(selectedURL: Foundation.URL) -> () (<compiler-generated>:0)
    7 Contacts Journal CRM 0x0000000104222c94 __51-[MacContactsViewController importCSVFileSelected:]_block_invoke (MacContactsViewController.m:954)
    8 AppKit 0x00000001bbe8f294 -[NSSavePanel didEndPanelWithReturnCode:] + 84

I can't diagnose the crash, because the report doesn't contain more information. I don't currently have access to the CSV file either, so I don't know what else I can do to prevent it.

What could possibly be causing this crash? Does it not matter that I am also trying to catch the errors it throws, or can the app crash for reasons internal to the framework?

Machine Learning & AI General TabularData

3.4k

Jun ’22

reading ldoor matrix and incomplete factorization?

I tried to read in the ldoor matrix and attempted the LLT factorization, but it gives me: "parseLdoor[55178:5595352] Factored does not hold a completed matrix factorization. (lldb)" Because the ldoor matrix is large, I have not been able to discover the issue. I am unsure whether the matrix data was converted correctly via the SparseConvertFromCoordinate function. On the other hand, I was able to use the same code to get the correct answers for the simple 4x4 example used in the Sparse Solver documentation. Any help would be appreciated. Here is my code ... without the ldoor matrix

App & System Services Core OS Accelerate TabularData

1.1k

Apr ’22

MLDataTable > Column Type "Int" when it should be "Double"

When importing a CSV file with ~50 columns and ~200 rows, the "MLDataTable(contentsOf: inputDataPath, options: parsingOptions)" command has issues parsing. Much of the data has "0" in the fields, but sporadically there are decimal values. If I have a column with, say, 180 "0"s where the last 20 rows have decimal values, the column is identified as an "Int" and lines are dropped during the parsing process. Is there a way to provide column type hints? Is there a way to force a column type? Is MLDataTable only looking at a handful of rows to determine the column type?

Machine Learning & AI Core ML Core ML TabularData

1.1k

Feb ’22

How to get the output of a for-in loop into a single DataFrame instead of many Dataframes

I am trying to get the data for each track in a MusicItemCollection into a DataFrame. The only way I know to go about this is to use a for-in loop, but this creates a DataFrame for each track instead of a DataFrame for all of the tracks. My code is:

    let albumTracks = album?.tracks
    for tracks in albumTracks {
        let dataFrame: DataFrame = [
            "track": [tracks.trackNumber],
            "title": [tracks.title],
            "artist": [tracks.artistName],
            "release date": [tracks.releaseDate?.formatted(date: .long, time: .omitted) ?? "not available"],
            "duration": [tracks.duration],
            "isrc": [tracks.isrc]
        ]
        print(dataFrame)
    }

I have also tried:

    for tracks in self.albumTracks {
        var dataFrame = DataFrame.init()
        let trackNumColumn = Column.init(name: "track", contents: [String(tracks.trackNumber!)])
        dataFrame.append(column: trackNumColumn)
        let titleColumn = Column.init(name: "title", contents: [tracks.title])
        dataFrame.append(column: titleColumn)
        let artistColumn = Column.init(name: "artist", contents: [tracks.artistName])
        dataFrame.append(column: artistColumn)
        let releaseDateColumn = Column.init(name: "release date", contents: [tracks.releaseDate?.formatted(date: .long, time: .omitted)])
        dataFrame.append(column: releaseDateColumn)
        let idColumn = Column.init(name: "id", contents: [String(tracks.id.rawValue)])
        dataFrame.append(column: idColumn)
        print(dataFrame)
    }

Both of these methods work outside of a loop according to the tech talks video: https://developer.apple.com/videos/play/tech-talks/10100/ , but I cannot figure out how to access the individual tracks outside of a loop. Thanks
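One way to end up with a single DataFrame, sketched here with a hypothetical TrackInfo type standing in for the MusicKit track: collect one array per field across all tracks first, then create each column once from the whole array. TrackColumns and makeDataFrame are hypothetical names, and the TabularData part is guarded.

```swift
import Foundation
#if canImport(TabularData)
import TabularData
#endif

/// Hypothetical stand-in for the fields read from a MusicKit track.
struct TrackInfo {
    let trackNumber: Int?
    let title: String
    let artistName: String
}

/// Gathers one array per field, with one entry per track.
struct TrackColumns {
    var numbers: [Int?] = []
    var titles: [String] = []
    var artists: [String] = []

    init(_ tracks: [TrackInfo]) {
        for track in tracks {
            numbers.append(track.trackNumber)
            titles.append(track.title)
            artists.append(track.artistName)
        }
    }
}

#if canImport(TabularData)
/// Builds one data frame with a row per track instead of one frame per track.
func makeDataFrame(from tracks: [TrackInfo]) -> DataFrame {
    let columns = TrackColumns(tracks)
    var dataFrame = DataFrame()
    dataFrame.append(column: Column(name: "track", contents: columns.numbers))
    dataFrame.append(column: Column(name: "title", contents: columns.titles))
    dataFrame.append(column: Column(name: "artist", contents: columns.artists))
    return dataFrame
}
#endif
```

The key difference from the loops in the post is that the DataFrame is created once, after the loop, so every track contributes a row to the same columns.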

UI Frameworks SwiftUI MusicKit SwiftUI TabularData

1.8k

Jan ’22
