array.append too slow

Question

Created Jul ’19

I need to build an array with 22 columns and 6000 rows. The bottleneck is that building it takes about 500 seconds, and I have tracked the time to the line below. The data is read from a file. Does anyone have any ideas on speeding this up? I can provide more information on request.

ccDataStr_StrArr.append(itemHolder ?? [])

Answer 1

DTS Engineer OP

Apple

Jul ’19

To give you a better answer I’d need to know more about what you’re doing. I can say, however, that Swift is capable of creating arrays of arrays quite quickly. For example, this code:

import Foundation

func test() {
    let start = Date()
    let row = [String](repeating: "", count: 22)
    let matrix = [[String]](repeating: row, count: 6000)
    let end = Date()
    print("'\(matrix[5999][21])'")
    print(end.timeIntervalSince(start))
}

prints this:

''
0.0007610321044921875

That is, it created an empty 6000 x 22 matrix in 0.7 ms.

This is a Debug build created by Xcode 10.2.1, running on macOS 10.14.5 on unremarkable hardware (a 2016 MacBook Pro).
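
Since the question is about `append` specifically, the same kind of measurement can be made for a matrix built one row at a time. The following is only a sketch: the helper name `buildByAppending` and the cell contents are made up for illustration, and the time it prints will vary with the machine and build settings.

```swift
import Foundation

// Hypothetical helper: builds a rows x columns matrix one append at a time.
func buildByAppending(rows: Int, columns: Int) -> [[String]] {
    var matrix: [[String]] = []
    // Optional: reserving capacity avoids intermediate reallocations.
    matrix.reserveCapacity(rows)
    for r in 0..<rows {
        var row: [String] = []
        row.reserveCapacity(columns)
        for c in 0..<columns {
            row.append("r\(r)c\(c)")   // made-up cell contents
        }
        matrix.append(row)
    }
    return matrix
}

let start = Date()
let matrix = buildByAppending(rows: 6000, columns: 22)
let elapsed = Date().timeIntervalSince(start)
print("Built \(matrix.count) rows in \(elapsed) seconds")
```

Array growth in Swift is amortized, so 6000 appends should not get progressively slower as the array fills; `reserveCapacity` only removes the occasional reallocation.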

Share and Enjoy
—
Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

Answer 2

Claude31 OP

Jul ’19

This may be more related to file reading than to the array appending that you have shown.

You need to show more code.

Answer 3

SV3021 OP

Jul ’19

Here is the code I am working with. I am still learning, and I am sure it will get faster as I learn. I put timers in places to find the bottleneck. Reading the file and the rest of the code are pretty fast, but line 119 (the final append) is the bottleneck. I enclosed a chart to show the times over a sample of 6000 records. The next paragraph is a sample of the data, which fits the struct in the code below. My current thought is to break the single array into buckets of arrays and then combine them at the end; if the chart is any indication, I think the time will come down drastically. Or somebody here can show me the mistake I have made that causes the slowness. I am currently running this in a playground for testing and realize that has a huge overhead, but if it speeds up there it will be extremely fast when compiled.

VFA122_EF PE3 FA-18E AMAH 166438 3BTWY51 PE3135350 5/15/2017 07:16:17.64 6/5/2017 04:08:07.346 020 200 FLIGHT HR SPECIAL INS PERFORM 200 FLIGHT HR SPECIAL INSP INSPECTION SC 0 000 030000K 09355 5/15/2017 07:17:28.7 PERFORMED 200 FLIGHT HR SPECIAL INSPECTION. 120-25MAY17-2136-CIV DENNISON, 220-15MAY17-1845-CIV THOMAS. 215

import UIKit

struct CCdata {
    var org_nm: String?
    var org_cd: String?
    var tms: String?
    var tec: String?
    var serno: String?
    var mcn: String?
    var jcn: String?
    var rcvd_dt: Date?
    var cmp_dt: Date?
    var wc_cd: String?
    var sys_rsn: String?
    var dscrp: String?
    var ty_maf_cd: String?
    var act_tkn_cd: String?
    var mal_cd: String?
    var wuc: String?
    var av_3m_tec: String?
    var create_dt: Date?
    var corr_act: String?
    var corr_act_2: String?
    var pos_cd: String?
    var modex: String?
}

let org_nmIndex = 0; let org_cdIndex = 1; let tmsIndex = 2; let tecIndex = 3; let sernoIndex = 4
let mcnIndex = 5; let jcnIndex = 6; let rcvd_dtIndex = 7; let cmp_dtIndex = 8; let wc_cdIndex = 9
let sys_rsnIndex = 10; let dscrpIndex = 11; let ty_maf_cdIndex = 12; let act_tkn_cdIndex = 13
let mal_cdIndex = 14; let wucIndex = 15; let av_3m_tecIndex = 16; let create_dtIndex = 17
let corr_actIndex = 18; let corr_act_2Index = 19; let pos_cdIndex = 20; let modexIndex = 21

let mafItemsTotal = 22

var line = [[String]]()
var arrayOfStrings: [String]?
var ccDataStr_StrArr = [[String]]()
var ccDataMafs:[CCdata] = [CCdata]()
var ccDataMaf = CCdata()
var MCNDictionary: [String:String] = [:]
var keyExists = false
let tabChar: String = "\t"
var tempHolder: [String]?
var itemHolder: [String]?
var itemCounter: Int?

let mafDateFormatter = DateFormatter()
mafDateFormatter.dateFormat = "MM/dd/yyyy HH:mm:ss.SSS"

func BuildMafData(ccData :inout CCdata, tempStr:[String]) -> String {
    keyExists = MCNDictionary[tempStr[mcnIndex]] != nil
    if tempStr[mcnIndex] != "" && !keyExists {
        MCNDictionary[tempStr[mcnIndex]] = tempStr[jcnIndex]
        ccData.org_nm = tempStr[org_nmIndex]
        ccData.org_cd = tempStr[org_cdIndex]
        ccData.tms = tempStr[tmsIndex]
        ccData.tec = tempStr[tecIndex]
        ccData.serno = tempStr[sernoIndex]
        ccData.mcn = tempStr[mcnIndex]
        ccData.jcn = tempStr[jcnIndex]
        ccData.rcvd_dt = mafDateFormatter.date(from: tempStr[rcvd_dtIndex])
        ccData.cmp_dt = mafDateFormatter.date(from: tempStr[cmp_dtIndex])
        ccData.wc_cd = tempStr[wc_cdIndex]
        ccData.sys_rsn = tempStr[sys_rsnIndex]
        ccData.dscrp = tempStr[dscrpIndex]
        ccData.ty_maf_cd = tempStr[ty_maf_cdIndex]
        ccData.act_tkn_cd = tempStr[act_tkn_cdIndex]
        ccData.mal_cd = tempStr[mal_cdIndex]
        ccData.wuc = tempStr[wucIndex]
        ccData.av_3m_tec = tempStr[av_3m_tecIndex]
        ccData.create_dt = mafDateFormatter.date(from: tempStr[create_dtIndex])
        ccData.corr_act = tempStr[corr_actIndex]
        ccData.corr_act_2 = tempStr[corr_act_2Index]
        ccData.pos_cd = tempStr[pos_cdIndex]
        ccData.modex = tempStr[modexIndex]
        return "Good"
    } else {
        return "Bad"
    }
}

do {
    // This solution assumes  you've got the file in your bundle
    if let path = Bundle.main.path(forResource: "VFA137 CC History", ofType: "txt"){
        let start = CFAbsoluteTimeGetCurrent()
        let data = try String(contentsOfFile:path, encoding: String.Encoding.ascii)
        //print(data)
        let replaced = data.replacingOccurrences(of: "\"", with: "")
        let updated = replaced.replacingOccurrences(of: "\r(?!\n)", with: "", options: .regularExpression)
        arrayOfStrings = updated.components(separatedBy: "\r\n")
        var timer: Int = arrayOfStrings?.count ?? 0
        //ccDataStr_StrArr.reserveCapacity(6000)
        if timer > 200 {
            timer = 6000
        }
        for i in 0..<timer {
            itemHolder = arrayOfStrings![i].components(separatedBy: tabChar)
            //print("I am itemHolder \(itemHolder as Any)")
            itemCounter = itemHolder?.count
            if itemCounter ?? 1 < 1 {
                if tempHolder?.isEmpty ?? true {
                    //print("Look here first \(i + 1)  \(itemCounter ?? 0)")
                    tempHolder = itemHolder
                } else {
                    //print("Look here \(i + 1)  \(itemCounter ?? 0)")
                    //var tryMe = arrayOfStrings![i].components(separatedBy: tabChar)
                    _ = itemHolder?.remove(at: 0)
                    tempHolder! += itemHolder ?? []
                    //print("Look here too \(i + 1)  \(tempHolder!.count)")
                    if tempHolder?.count == 22 {
                        //print("We fixed it")
                        ccDataStr_StrArr.append(tempHolder!)
                        tempHolder = []
                    }
                }
            } else {
                let start = CFAbsoluteTimeGetCurrent()
                ccDataStr_StrArr.append(itemHolder!)
                let diff = CFAbsoluteTimeGetCurrent() - start
                print("\(i) Took \(diff) seconds for \(timer) records and the count is \(ccDataStr_StrArr.count)")
            }
        }
        let diff = CFAbsoluteTimeGetCurrent() - start
        print("Took \(diff) seconds for \(timer) records and the count is \(ccDataStr_StrArr.count)")
    }
} catch let err as NSError {
    // do something with Error
    print(err)
}

Answer 4

Claude31 OP

Jul ’19

Could you show the result of

        print("Took \(diff) seconds for \(timer) records and the count is \(ccDataStr_StrArr.count)")

The logic of this part is hard to follow.

Why do you need to remove the element at index 0?

            if itemCounter ?? 1 < 1 { 
                if tempHolder?.isEmpty ?? true { 
                    tempHolder = itemHolder 
                } else { 
                    //var tryMe = arrayOfStrings![i].components(separatedBy: tabChar) 
                    _ = itemHolder?.remove(at: 0) 
                    tempHolder! += itemHolder ?? [] 
                    if tempHolder?.count == 22 { 
                        //print("We fixed it") 
                        ccDataStr_StrArr.append(tempHolder!) 
                        tempHolder = [] 
                    } 
                }

You write:

        if timer > 200 {
            timer = 6000
        }
        for i in 0..<timer {
            itemHolder = arrayOfStrings![i].components(separatedBy: tabChar)

That is surprising. How are you sure arrayOfStrings has at least 6000 elements?
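
One way to avoid running past the end of the array is to let the collection clamp the bound itself. A minimal sketch, assuming a small stand-in for `arrayOfStrings` (the sample strings and the name `recordLimit` are hypothetical):

```swift
import Foundation

// Hypothetical stand-in for the lines read from the file.
let arrayOfStrings: [String] = ["a\tb", "c\td", "e\tf"]
let recordLimit = 6000

var rows: [[String]] = []
// prefix(_:) yields at most recordLimit elements, so a file with fewer
// lines than the limit cannot cause an out-of-range index.
for line in arrayOfStrings.prefix(recordLimit) {
    rows.append(line.components(separatedBy: "\t"))
}
print(rows.count)   // prints 3
```

Iterating over the elements directly also removes the need for the force-unwrapped subscript `arrayOfStrings![i]`.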

PS: you have another thread open on a closely related topic.

Is it solved now? If not, what is the remaining issue?

If yes, please close that thread.

Answer 5

SV3021 OP

Jul ’19

The chart below shows the values climbing. I have the txt file and have tested extensively to make sure all the data was there. The removal of index 0 is needed because a few lines arrive broken up into fewer than 22 items, so those lines are put back together in that routine. For this test I set the itemCounter check to `< 1` so that the code stays out of that routine; I only want to append the final data. I found all this by injecting print statements and watching values. The final append is the issue. You can see in the chart below how the time ramps up over the 6000 records. Hope this helps.

0 Took 0.00016796588897705 seconds for 6000 records
1 Took 0.000228047370910644 seconds for 6000 records
2 Took 0.000340938568115234 seconds for 6000 records
3 Took 0.000487089157104492 seconds for 6000 records
4 Took 0.00066995620727539 seconds for 6000 records
5 Took 0.000787019729614257 seconds for 6000 records
6 Took 0.000857949256896972 seconds for 6000 records
7 Took 0.00110197067260742 seconds for 6000 records
8 Took 0.00119590759277343 seconds for 6000 records
9 Took 0.00118494033813476 seconds for 6000 records
10 Took 0.00166797637939453 seconds for 6000 records
11 Took 0.00156891345977783 seconds for 6000 records
12 Took 0.00149297714233398 seconds for 6000 records
13 Took 0.00159096717834472 seconds for 6000 records
14 Took 0.00174891948699951 seconds for 6000 records
15 Took 0.00188899040222167 seconds for 6000 records
16 Took 0.00212502479553222 seconds for 6000 records
17 Took 0.00229895114898681 seconds for 6000 records
18 Took 0.00263392925262451 seconds for 6000 records
19 Took 0.00229406356811523 seconds for 6000 records
20 Took 0.00239098072052001 seconds for 6000 records
21 Took 0.00252807140350341 seconds for 6000 records
22 Took 0.00264704227447509 seconds for 6000 records
23 Took 0.00287497043609619 seconds for 6000 records
24 Took 0.00312709808349609 seconds for 6000 records
3000 Took 0.0745790004730224 seconds for 6000 records
3001 Took 0.0754179954528808 seconds for 6000 records
3002 Took 0.0749650001525878 seconds for 6000 records
3003 Took 0.0751529932022094 seconds for 6000 records
3004 Took 0.0755579471588134 seconds for 6000 records
3005 Took 0.075671911239624 seconds for 6000 records
3006 Took 0.0750819444656372 seconds for 6000 records
3007 Took 0.0757720470428466 seconds for 6000 records
3008 Took 0.0748960971832275 seconds for 6000 records
3009 Took 0.0760709047317504 seconds for 6000 records
3010 Took 0.0757379531860351 seconds for 6000 records
3011 Took 0.0758119821548461 seconds for 6000 records
3012 Took 0.0759049654006958 seconds for 6000 records
3013 Took 0.0753329992294311 seconds for 6000 records
3014 Took 0.0754640102386474 seconds for 6000 records
3015 Took 0.0755339860916137 seconds for 6000 records
3016 Took 0.0767890214920044 seconds for 6000 records
3017 Took 0.0777879953384399 seconds for 6000 records
3018 Took 0.0759029388427734 seconds for 6000 records
3019 Took 0.0758908987045288 seconds for 6000 records
3020 Took 0.0757219791412353 seconds for 6000 records
3021 Took 0.0765930414199829 seconds for 6000 records
3022 Took 0.0758750438690185 seconds for 6000 records
3023 Took 0.0765589475631713 seconds for 6000 records
3024 Took 0.0756800174713134 seconds for 6000 records
5974 Took 0.148245930671691 seconds for 6000 records
5975 Took 0.145865082740783 seconds for 6000 records
5976 Took 0.147732973098754 seconds for 6000 records
5977 Took 0.145815014839172 seconds for 6000 records
5978 Took 0.145707964897155 seconds for 6000 records
5979 Took 0.145455002784729 seconds for 6000 records
5980 Took 0.145587921142578 seconds for 6000 records
5981 Took 0.145675897598266 seconds for 6000 records
5982 Took 0.147395014762878 seconds for 6000 records
5983 Took 0.149832010269165 seconds for 6000 records
5984 Took 0.146044015884399 seconds for 6000 records
5985 Took 0.152476072311401 seconds for 6000 records
5986 Took 0.145820021629333 seconds for 6000 records
5987 Took 0.145681023597717 seconds for 6000 records
5988 Took 0.152094006538391 seconds for 6000 records
5989 Took 0.146000027656555 seconds for 6000 records
5990 Took 0.14866292476654 seconds for 6000 records
5991 Took 0.14999496936798 seconds for 6000 records
5992 Took 0.148908972740173 seconds for 6000 records
5993 Took 0.149780988693237 seconds for 6000 records
5994 Took 0.147336959838867 seconds for 6000 records
5995 Took 0.148766994476318 seconds for 6000 records
5996 Took 0.152781009674072 seconds for 6000 records
5997 Took 0.147929906845092 seconds for 6000 records
5998 Took 0.146328926086425 seconds for 6000 records
5999 Took 0.146076917648315 seconds for 6000 records

Answer 6

OOPer OP

Jul ’19

I am running this in a playground currently for testing

You should know one thing:

A playground is not a good place to test performance.

It does many, many things behind the scenes when executing each line, so your "Took" time measures that playground overhead, not the time of `append`.

I converted your code into a Command Line Tool app and tested it on my Mac mini, and the result was...

...
5990 Took 9.5367431640625e-07 seconds for 6000 records and the count is 5991
5991 Took 0.0 seconds for 6000 records and the count is 5992
5992 Took 0.0 seconds for 6000 records and the count is 5993
5993 Took 0.0 seconds for 6000 records and the count is 5994
5994 Took 0.0 seconds for 6000 records and the count is 5995
5995 Took 0.0 seconds for 6000 records and the count is 5996
5996 Took 0.0 seconds for 6000 records and the count is 5997
5997 Took 0.0 seconds for 6000 records and the count is 5998
5998 Took 0.0 seconds for 6000 records and the count is 5999
5999 Took 0.0 seconds for 6000 records and the count is 6000
Took 0.14691698551177979 seconds for 6000 records and the count is 6000

Your code is far from efficient, but if you run it in a more appropriate environment, the timing will be very different.

Answer 7

SV3021 OP

Jul ’19

That is good to know. What did you use for the 6000 records? If you have some recommendations I would like to hear them. I am working on learning as I go. This was actually a project I built in MS Access a few years ago.

Answer 8

OOPer OP

Jul ’19

What did you use for the 6000 records?

I used the example line repeatedly.
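
A test input of that kind can be sketched roughly as follows. This is an illustration under assumptions, not the exact code used for the timing above: `sampleLine` is a made-up 22-field record (the real example line could be pasted in instead), and `recordCount` is a hypothetical name.

```swift
import Foundation

// Hypothetical 22-field record with made-up contents.
let sampleLine = (1...22).map { "field\($0)" }.joined(separator: "\t")
let recordCount = 6000

// Repeat the line, separated by CRLF as in the original file.
let text = Array(repeating: sampleLine, count: recordCount)
    .joined(separator: "\r\n")

let start = Date()
var rows: [[String]] = []
rows.reserveCapacity(recordCount)
for line in text.components(separatedBy: "\r\n") {
    rows.append(line.components(separatedBy: "\t"))
}
let elapsed = Date().timeIntervalSince(start)
print("Took \(elapsed) seconds for \(rows.count) records")
```

Timing the whole loop once, in a compiled command-line tool rather than a playground, gives a number that reflects the splitting and appending work itself.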

I guess you want to make an Array of CCdata from your "VFA137 CC History.txt". (Or you may have many other input files in the same format.)

Then you had better go back to your previous thread and continue it with some additional info:

- Actual input: a few unmodified lines of your txt file would do, preferably including some lines that explain why you need the `if itemCounter...` block

- Desired output: Array of CCdata

Answer 9

SV3021 OP

Jul ’19

The reason is that the text file is generated by another system, which sometimes breaks a 22-item line into several shorter lines. I use this routine to put the lines back together before appending them to the final CCdata. Below are some sample lines that show the issue. Lines 1, 2 and 3 belong together, and lines 4, 5, 6 and 7 also belong together. Line 8 is OK the way it is.

VFA122_EF PE3 FA-18E AMAH 165897 3BTVMMF PE3336110 12/1/2016 11:01:42.71 4/11/2017 09:00:43.63 020 84 DAY SPECIAL CORROSION COMPLY WITH 84 DAY SPECIAL CORROSION INSPECT(ATFLIR) INSPECTION ON [ATFLIR, E/F AN/ASQ-228(v)2 - FRP403] IAW AW-228AC-MRC-300 SC 0 000 030000F 09355 12/1/2016 11:01:59.85 "COMPLIED WITH 84 DAY SPECIAL INSPECTION: 120-PO2 C BLAIS-12/27/2016-17:53;210-AT2 J WAGNER-4/11/2017
-03:47.
" 200
VFA122_EF PE3 FA-18E AMAH 166438 3BTWWLG PE3129515 5/9/2017 00:03:21.936 6/12/2017 16:44:25.29 020 DD: 5/10/2017 14 DAY SPE PERFORM 14 DAY SPECIAL INSP INSPECTION SC 0 000 030000A 09355 5/9/2017 00:04:31.606 "PERFORMED 14 DAY SPECIAL INSPECTION: X51-CIV T MILLER-5/18/2017-09:32; 110-AD1 C WANG-6/12/2017-08:23;
 230-AO2 C HALE-6/10/2017-08:58; 310-CIV T SMITH-6/10/2017-11:52; 120-AM1 R HOHMANN-6/10/2017-14:31;
 13B-AME2 J MARALDO-6/12/2017-15:15; 220-AE2 J CA" " STILLO-6/10/2017-14:33
" 215
VFA122_EF PE3 FA-18E AMAH 166438 3BTWY51 PE3135350 5/15/2017 07:16:17.64 6/5/2017 04:08:07.346 020 200 FLIGHT HR SPECIAL INS PERFORM 200 FLIGHT HR SPECIAL INSP INSPECTION SC 0 000 030000K 09355 5/15/2017 07:17:28.7 PERFORMED 200 FLIGHT HR SPECIAL INSPECTION. 120-25MAY17-2136-CIV DENNISON, 220-15MAY17-1845-CIV THOMAS. 215

Answer 10

OOPer OP

Jul ’19

Accepted Answer

It is not clear where the TABs are in your newly shown text example, but it looks like tab-separated values (TSV) where some items can contain line breaks when enclosed in double quotes.

If you are working with such data, as I guess, counting up to 22 items is not a good way to handle it.

You can try something like this:

import Foundation

let tsvText = """
VFA122_EF\tPE3\tFA-18E\tAMAH\t165897\t3BTVMMF\tPE3336110\t12/1/2016 11:01:42.71\t4/11/2017 09:00:43.63\t020\t84 DAY SPECIAL CORROSION COMPLY WITH 84 DAY SPECIAL CORROSION INSPECT(ATFLIR) INSPECTION ON [ATFLIR, E/F AN/ASQ-228(v)2 - FRP403] IAW AW-228AC-MRC-300 SC 0 000\t030000F\t09355\t12/1/2016 11:01:59.85\t"COMPLIED WITH 84 DAY SPECIAL INSPECTION: 120-PO2 C BLAIS-12/27/2016-17:53;210-AT2 J WAGNER-4/11/2017
-03:47.
"\t200
VFA122_EF\tPE3\tFA-18E\tAMAH\t166438\t3BTWWLG\tPE3129515\t5/9/2017 00:03:21.936\t6/12/2017 16:44:25.29\t020\tDD: 5/10/2017\t14 DAY SPE PERFORM 14 DAY SPECIAL INSP INSPECTION SC 0\t000\t030000A\t09355\t5/9/2017 00:04:31.606\t"PERFORMED 14 DAY SPECIAL INSPECTION: X51-CIV T MILLER-5/18/2017-09:32; 110-AD1 C WANG-6/12/2017-08:23;
230-AO2 C HALE-6/10/2017-08:58; 310-CIV T SMITH-6/10/2017-11:52; 120-AM1 R HOHMANN-6/10/2017-14:31;
13B-AME2 J MARALDO-6/12/2017-15:15; 220-AE2 J CA"\t" STILLO-6/10/2017-14:33
"\t215
"""

let pattern = "[ ]*(?:\"((?:[^\"]|\"\")*)\"|([^\t\"\r\\n]*))[ ]*(\t|\r\\n?|\\n|$)"
let regex = try! NSRegularExpression(pattern: pattern)

var result: [[String]] = []
var record: [String] = []
let offset: Int = 0

regex.enumerateMatches(in: tsvText, options: .anchored, range: NSRange(0..<tsvText.utf16.count)) {match, flags, stop in
    guard let match = match else {fatalError()}
    if let quotedRange = Range(match.range(at: 1), in: tsvText) {
        let field = tsvText[quotedRange].replacingOccurrences(of: "\"\"", with: "\"")
        record.append(field)
    } else if let range = Range(match.range(at: 2), in: tsvText) {
        let field = tsvText[range].trimmingCharacters(in: .whitespaces)
        record.append(field)
    }
    let separator = tsvText[Range(match.range(at: 3), in: tsvText)!]
    switch separator {
    case "": //end of text
        //Ignoring empty last line...
        if record.count > 1 || (record.count == 1 && !record[0].isEmpty) {
            result.append(record)
        }
        stop.pointee = true
    case "\t": //tab
        break
    default: //newline
        result.append(record)
        record = []
    }
}

print(result)

This is not super-efficient, but handles TSV data more appropriately.
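
Once the rows are parsed, turning them into typed records is a separate step. The following is a minimal sketch, not the poster's code: `MafRecord` is a hypothetical, trimmed-down stand-in for the `CCdata` struct, the field positions follow the index constants from the earlier post (0, 5, 6 and 21), and the sample rows are made up.

```swift
// Hypothetical, trimmed-down stand-in for CCdata.
struct MafRecord {
    static let fieldCount = 22

    let orgName: String
    let mcn: String
    let jcn: String
    let modex: String

    // Fails (returns nil) unless the row has exactly 22 fields.
    init?(fields: [String]) {
        guard fields.count == MafRecord.fieldCount else { return nil }
        orgName = fields[0]
        mcn = fields[5]
        jcn = fields[6]
        modex = fields[21]
    }
}

// Made-up rows: one well-formed, one too short.
let parsedRows: [[String]] = [
    (0..<22).map { "v\($0)" },
    ["too", "short"],
]

// compactMap drops the rows whose initializer returned nil.
let records = parsedRows.compactMap(MafRecord.init(fields:))
print(records.count)   // prints 1
```

Validating the field count in a failable initializer keeps malformed rows from crashing on an out-of-range subscript; whether to drop or report such rows is a design choice left open here.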

Answer 11

amymariaparker OP

Jul ’19

Thanks

Answer 12

SV3021 OP

Jul ’19

Sorry it took so long to respond. I had to do some studying to understand what this did. It is very fast: 79 seconds for the original file. I did change one line that didn't make sense to me, as I couldn't find an instance of it doing anything. The new line did fix some issues that were showing up, though. Thank you so much for the great learning experience.

Original line:

let field = tsvText[quotedRange].replacingOccurrences(of: "\"\"", with: "\"")

Replacement line:

let field = tsvText[quotedRange].replacingOccurrences(of: "[\\t\\n\\r]", with: "", options: .regularExpression)
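
The two replacements are not mutually exclusive. If the data ever contains doubled quotes inside a quoted field, both steps could be chained. A small sketch, where `cleanQuotedField` is a hypothetical helper name and the input string is made up:

```swift
import Foundation

// Hypothetical helper combining both clean-up steps for a quoted field.
func cleanQuotedField(_ raw: String) -> String {
    return raw
        // Unescape doubled quotes: "" becomes ".
        .replacingOccurrences(of: "\"\"", with: "\"")
        // Strip embedded tabs and line breaks.
        .replacingOccurrences(of: "[\\t\\n\\r]", with: "",
                              options: .regularExpression)
}

let cleaned = cleanQuotedField("A\"\"B\tC\r\nD")
print(cleaned)   // prints A"BCD
```

Note that stripping the line breaks joins the surrounding text with no separator; replacing them with a space instead may read better for free-text fields.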

Answer 13

OOPer OP

Jul ’19

Thanks for reporting. I'm very happy that some parts of my code helped to solve your issue.

I will keep in mind that you needed some fixes. Thanks for sharing your experience.
