Substitutions in strings

I'm trying to figure out how to do some substitutions in word strings, so that the pattern of repeated letters will be revealed. For example, these words:

dog

cat

man

would all have the same pattern "123", since no letters are repeated. But the words:

add

bee

would both have the pattern "122", since their second letter is repeated in the third position. Similarly, the words gag and eye would both have the pattern "121", since the first letter is repeated in the third position. The words away and baby would have the pattern "1214", and so on.


I've been trying for hours to figure out how to do this. I'm sure seeing a professional solution would teach me some very interesting things about Swift!

That should do the job:


let text = "dogged"     // Any example text you want

var firstOccurence : [String: String] = [:]    // Index assigned at the first occurrence of each letter
var pattern = ""    // pattern will be 123341
var indexInPattern = 0       // Running index of distinct letters, written into the pattern


for c in text {
    let s = String(c)     // Convert to string
    if firstOccurence[s] == nil {   // Letter not yet found
        indexInPattern += 1
        let indexString = String(indexInPattern)
        firstOccurence[s] = indexString
        pattern = pattern + indexString
    } else {    // The letter was found already
        pattern = pattern + firstOccurence[s]!
    }
}
print(pattern)


To test on your examples, I created an array of String:

let texts = ["dog", "cat", "man", "add", "bee", "gag", "eye", "away", "baby"]

for text in texts {
    var firstOccurence : [String: String] = [:]    // Index assigned at the first occurrence of each letter
    var indexInPattern = 0       // Running index of distinct letters
    var pattern = ""    // e.g. 122 for "add"

    for c in text {
        let s = String(c)
        if firstOccurence[s] == nil {   // Letter not yet found
            indexInPattern += 1
            let indexString = String(indexInPattern)
            firstOccurence[s] = indexString
            pattern = pattern + indexString
        } else {    // The letter was found already
            pattern = pattern + firstOccurence[s]!
        }
    }
    print(text, pattern)
}


Result is as expected (but not exactly what you described for away):

dog 123

cat 123

man 123

add 122

bee 122

gag 121

eye 121

away 1213

baby 1213


So, to match your spec exactly, I changed it to:

for text in texts {
    var firstOccurence : [String: String] = [:]    // 1-based position of the first occurrence of each letter
    var pattern = ""    // e.g. 1214 for "away"

    for (pos, c) in text.enumerated() {
        let s = String(c)
        if firstOccurence[s] == nil {   // Letter not yet found
            let posString = String(pos+1)
            firstOccurence[s] = posString
            pattern = pattern + posString
        } else {    // The letter was found already
            pattern = pattern + firstOccurence[s]!
        }
    }
    print(text, pattern)
}


This now gives:

dog 123

cat 123

man 123

add 122

bee 122

gag 121

eye 121

away 1214

baby 1214



And with "🐶🐱🐱🐶🐱"

you get:

🐶🐱🐱🐶🐱 12212



One could use map functions on text, but I find it a bit overkill here.
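
For comparison, a map-based version might look like the sketch below. The function name positionPattern is made up for this illustration and does not come from the posts above. It should produce the same position-based pattern string as the loop version, so it shares the same limitation once an index reaches 10.

```swift
// Hypothetical helper (name invented for this sketch).
// Maps each character to the 1-based position of its first occurrence.
func positionPattern(_ text: String) -> String {
    var firstPosition: [Character: Int] = [:]   // first 1-based position of each character
    return text.enumerated().map { item -> String in
        if let seen = firstPosition[item.element] {
            return String(seen)                 // character already seen: reuse its position
        }
        firstPosition[item.element] = item.offset + 1
        return String(item.offset + 1)          // first occurrence: use the current position
    }.joined()
}

print(positionPattern("away"))   // prints: 1214
```

Using Character as the dictionary key avoids converting every character to a String first.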

Oooh, this is like doing someone’s programming homework (-:

Here’s a functional version of this, just to entertain Claude31 (-:

func patternForLetters(_ input: String) -> [Int] {
    let firstIndexForLetter = input.enumerated().reduce([:]) { (soFar, offsetAndLetter) -> [Character:Int] in
        let offset = offsetAndLetter.offset
        let letter = offsetAndLetter.element
        return soFar.merging([letter:offset + 1], uniquingKeysWith: { l, r in return l })
    }
    return input.map { firstIndexForLetter[$0]! }
}

for s in ["dog", "cat", "man", "add", "bee", "gag", "eye", "away", "baby"] {
    print(s, "->", patternForLetters(s))
}
// prints:
// dog -> [1, 2, 3]
// cat -> [1, 2, 3]
// man -> [1, 2, 3]
// add -> [1, 2, 2]
// bee -> [1, 2, 2]
// gag -> [1, 2, 1]
// eye -> [1, 2, 1]
// away -> [1, 2, 1, 4]
// baby -> [1, 2, 1, 4]

Note that I’m returning an array of integers because, if you return a string, the function produces ambiguous results for long inputs: once an index reaches 10, different patterns can produce the same digits.

print(patternForLetters("abcdefghiaj").map { "\($0)" }.joined() )
// prints: 123456789111
print(patternForLetters("abcdefghiaaa").map { "\($0)" }.joined() )
// prints: 123456789111

Share and Enjoy

Quinn “The Eskimo!”
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

I was sure you would find a very elegant solution with advanced functions!


However, it takes some practice to understand the code.


You're right: above 9, the result is ambiguous.

So there are several options if we need to return a String:

- use letters above 9 (A, B, C…)

- separate the numbers with dashes (not very cool).
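
As a rough sketch of the second option, the numbers can be joined with a separator so that indexes above 9 stay readable. The helper names firstPositions(of:) and dashedPattern(_:) are invented for this example and are not from the thread.

```swift
// Hypothetical helper: 1-based position of the first occurrence of each character.
func firstPositions(of text: String) -> [Int] {
    var firstPosition: [Character: Int] = [:]
    var result: [Int] = []
    for (offset, character) in text.enumerated() {
        // Reuse the stored position if the character was seen before.
        let position = firstPosition[character] ?? (offset + 1)
        firstPosition[character] = position
        result.append(position)
    }
    return result
}

// Hypothetical helper: joins the positions with dashes.
func dashedPattern(_ text: String) -> String {
    return firstPositions(of: text).map { String($0) }.joined(separator: "-")
}

print(dashedPattern("abcdefghiaj"))    // prints: 1-2-3-4-5-6-7-8-9-1-11
print(dashedPattern("abcdefghiaaa"))   // prints: 1-2-3-4-5-6-7-8-9-1-1-1
```

The two inputs that collapsed to the same digit string earlier now give different results.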


Here is (my poor 😁) revised version, just for fun:


extension Int {
    func toAlphanum() -> String {
        if self <= 0 { return "" }
        if self > 35 { return "@" }    // 35 -> Z; no letter is available beyond that
        if self < 10 {
            return String(self)
        } else {
            let u = UnicodeScalar(self + 55)    // 10 -> A
            // Convert UnicodeScalar to a Character.
            let char = Character(u!)
            return String(char)
        }
    }
}

for text in texts {
    var firstOccurence : [String: Int] = [:]    // 1-based position of the first occurrence of each letter
    var pattern = ""    // e.g. 123351 for "dogged"

    for (pos, c) in text.enumerated() {
        let s = String(c)
        if firstOccurence[s] == nil {   // Letter not yet found
            firstOccurence[s] = pos+1
            pattern = pattern + (pos+1).toAlphanum()
        } else {    // The letter was found already
            pattern = pattern + firstOccurence[s]!.toAlphanum()
        }
        }
    }
    print(text, pattern)
}


"I'm so happy just to dance with you" gives the following (Albinus did not specify whether upper and lower case should be treated as the same letter, so they are kept distinct here):

I'm so happy just to dance with you 123456489AAC4EF5H4H64M9OPQ4STH84C6F

Thanks for the suggestions. After much tearing of hair and rending of garments, I finally came up with this:


var substitutionString = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
var patternString = word.lowercased()
for char in patternString {
  patternString = patternString.replacingOccurrences(of: String(char), with: String(substitutionString.first ?? "*"))
  substitutionString.removeFirst()
}

Fortunately all possible values for word are less than 26 characters long.


Now I'm wondering how to use the pattern data from a list of >100K words. I thought of making a dictionary with the words as keys and the patterns as values. But since I want to extract lists of words that fit a given pattern, that seems to be inefficient. Would it be simpler to generate it as an array of tuples of the form (pattern, word) instead?

I'm not cheating on my homework! I'm just trying to get back into Swift programming after being away from it for a couple of years. Your help is much appreciated.

Note: your code could crash if a word has more than 26 characters. You should add a check.


Anyway, you could generate a dictionary of patterns, containing all the words for each pattern:


var dictOfPatterns = [String: [String]]()


Then, in the loop where you compute the patterns, also fill the dictionary.


Here, words is the array of 100K words:

for word in words {     // The 100K words
    var substitutionString = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    var patternString = word.lowercased()
    for char in patternString {
        patternString = patternString.replacingOccurrences(of: String(char), with: String(substitutionString.first ?? "*"))
        if substitutionString.isEmpty { break }     // To avoid a crash in removeFirst()
        substitutionString.removeFirst()
    }
    dictOfPatterns[patternString, default: []].append(word)    // Group the word under its pattern
}
print(dictOfPatterns)

For

["dog", "cat", "man", "add", "bee", "gag", "eye", "away", "baby", "🐶🐱🐱🐶🐱"]

You get (the order of the keys may vary):

["ABA": ["gag", "eye"], "ABB": ["add", "bee"], "ABAD": ["away", "baby"], "ABBAB": ["🐶🐱🐱🐶🐱"], "ABC": ["dog", "cat", "man"]]

Albinus wrote:

I'm not cheating on my homework!

I’m sorry I gave that impression. My comment was not meant to be taken seriously.

Claude31 wrote:

Anyway, you could generate a dictionary for patterns, containing all the words for each pattern

I’d do this using dictionary’s grouping support. Imagine you have a sequence of values:

let values = [5, 2, 1, 3, 4]

and a function that returns a group for each value (in this case there are two groups, even and odd):

func groupForValue(_ i: Int) -> String {
    if i.isMultiple(of: 2) {
        return "even"
    } else {
        return "odd"
    }
}

You can group the values in the sequence like so:

let groups = Dictionary(grouping: values, by: groupForValue)
print(groups)   // prints: ["even": [2, 4], "odd": [5, 1, 3]]
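
Applied to the word list, the same initializer could build the pattern lookup directly. This is only a sketch: patternKey(for:) is a hypothetical helper, and using an [Int] array as the dictionary key is an assumption made here (arrays of Hashable elements are themselves Hashable), chosen so that there is no 26-character limit. Lowercasing the word first is also an assumption about how case should be handled.

```swift
// Hypothetical helper: 1-based position of each character's first occurrence,
// computed on the lowercased word.
func patternKey(for word: String) -> [Int] {
    var firstPosition: [Character: Int] = [:]
    return word.lowercased().enumerated().map { item -> Int in
        if let seen = firstPosition[item.element] {
            return seen
        }
        firstPosition[item.element] = item.offset + 1
        return item.offset + 1
    }
}

let words = ["dog", "cat", "man", "add", "bee", "gag", "eye", "away", "baby"]

// Group every word under its pattern.
let wordsByPattern = Dictionary(grouping: words, by: { patternKey(for: $0) })

// Look up all the words that fit a given pattern.
print(wordsByPattern[[1, 2, 1]] ?? [])      // prints: ["gag", "eye"]
print(wordsByPattern[[1, 2, 1, 4]] ?? [])   // prints: ["away", "baby"]
```

Within each group the words keep the order they had in the input, and a pattern lookup becomes a single dictionary access rather than a scan over an array of tuples.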


<< I’m sorry I gave that impression. My comment was not meant to be taken seriously. >>


I was aware of that... I was using a whimsical "!", not an offended one! <=(whimsical)

We're all playing a little bit here. That's refreshing.


But, most importantly, do you now have an answer to your initial question(s)?

Yes! And I now have some handy code snippets to refer to in times of trouble. Thanks to all who contributed.
