Subscripting into array with int-like variable

Indexing into arrays is only possible using the Int type.

Since Int is 64-bit, you might assume that indexing into an array would also work with a UInt8, for example, as it is only 8 bits and its whole range fits within Int. But this produces a compiler error.

This is expected, as the Array struct defines subscripting as:

subscript (index: Int) -> T


I assume that smart people have thought about this, and that there are good reasons for making this index an Int.

But why not allow indexing via 'UnsignedIntegerType', for example? That would allow indexing into an array with any of the integer types, and there would be no need to create separate Ints via Int(uint16Value).


Can anyone elaborate?



Code examples of the errors and how to fix or work around them:

// The count (4096) is larger than what can be represented by UInt8.
var int8Array: [UInt8] = [UInt8](count: 1 << 12, repeatedValue: 0)

let index8: UInt8 = 12
let valueAtIndex8 = int8Array[index8]
// error: cannot subscript a value of type '[UInt8]' with an index of type 'UInt8'

let index16: UInt16 = 12
let valueAtIndex16: UInt8 = int8Array[index16]
// error: cannot subscript a value of type '[UInt8]' with an index of type 'UInt16'

let index64: UInt64 = 12
let valueAtIndex64 = int8Array[index64]
// error: cannot subscript a value of type '[UInt8]' with an index of type 'UInt64'

// No error here, but an extra conversion is needed
let index16AsInt: Int = Int(index16)
let valueAtIndex16AsInt = int8Array[index16AsInt]

// No error here
let indexInt: Int = 12 // A 64-bit signed integer value type
let valueAtIndexInt: UInt8 = int8Array[indexInt]
Accepted Answer

It's probably for the sake of simplicity. Normal Ints are far more common - in my code, at least - than UInts, or Int8 and so on. So Int is the integer type least likely to require a conversion. However, you can absolutely add your own subscript:


extension Array {
  // Accept a UInt index by forwarding to the existing Int subscript.
  subscript(i: UInt) -> T {
    get {
      return self[Int(i)]
    }
    set {
      self[Int(i)] = newValue
    }
  }
}
let unsigned: UInt = 3
let ar = [1, 2, 3, 4, 5]
ar[unsigned] // 4
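
Since the extension defines a setter as well, assignment through a UInt index works too:

var mutableAr = [1, 2, 3, 4, 5]
mutableAr[unsigned] = 42 // uses the custom setter; mutableAr is now [1, 2, 3, 42, 5]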


I think the default is Int, though, because that's what people would be most familiar with, and would most commonly use.

You cannot store more than Int.max elements in an array anyway. And there are various other problems related to UInt (mainly with operations).
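
For example, here is a small sketch of the kind of friction UInt causes (error messages paraphrased):

let a: UInt = 5
let b = 2 // inferred as Int
// let sum = a + b       // error: Swift never mixes integer types implicitly
let sum = a + UInt(b)    // an explicit conversion is always required

let x: UInt = 3
let y: UInt = 5
// let d = x - y         // traps at runtime: unsigned subtraction cannot go negative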

I would argue that the more Swift-like way of declaring something is to use a protocol type, and the most generic one at that. But maybe you are right.

Normally I would not even notice, as we deal mostly with Ints. In this case, however, I am working on a piece of code that really needs to be as fast as possible and deals with UInt8, UInt16, etc. That's why I was wondering.

Maybe Swift is not yet meant for performant low-level code, and if that turns out to be the case I should stick to (Objective-)C for the critical parts.

I know I cannot store more than Int.max elements in an array. Int is 64 bit. If I can use that to index into an array, why not UInt8? It's way smaller than Int and can only index the first 256 elements of any array. So I don't see how the fact that an array cannot be larger than Int.max is related to my question.

Could you explain what you mean by 'various other problems'?

You could use protocols for array indexing, but I'd imagine that the nature of Int is tied up in the implementation of Array. As in: you could have the subscript take just an IntegerType, but you might still end up having to convert to the base type.
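
For instance, since Swift currently doesn't allow a subscript to have its own generic parameters, a generic accessor has to be a method, and it still converts to Int internally. A sketch (the name elementAt is made up):

extension Array {
  // Hypothetical helper: accepts any unsigned integer type, but
  // the conversion to the base index type still happens inside.
  func elementAt<I: UnsignedIntegerType>(index: I) -> T {
    return self[Int(index.toUIntMax())]
  }
}

let bytes: [UInt8] = [10, 20, 30]
let i: UInt16 = 2
bytes.elementAt(i) // 30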

Good point. I wonder anyway how costly Int(someValue) is. Maybe the standard library / compiler can handle this case in a smart way (remember those tagged pointers in good ol' Objective-C? 😉).

I know this doesn't add much to the discussion (sorry!) but I want to make sure that we're all on the same page here.

Int is 64 bit.

... on 64-bit platforms. On 32-bit platforms (older iOS hardware, watchOS) Int is 32 bits.
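
You can check this for yourself at runtime (a quick sketch using sizeof):

// Prints 64 on 64-bit platforms, 32 on 32-bit platforms.
println("Int is \(sizeof(Int) * 8) bits")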

Share and Enjoy

Quinn "The Eskimo!"
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1@apple.com"

Thanks for clearing that up, since in the header files it says:

/// A 64-bit signed integer value
/// type.
struct Int : SignedIntegerType {
...

Indeed. That's an artefact of the way that your app is built; if you configure your project to build just "armv7", then command-clicking on Int will take you to a 32-bit declaration:

/// A 32-bit signed integer value
/// type.
struct Int : SignedIntegerType {
    var value: Builtin.Int32
...

*phew* You had me doubting myself for a minute there (-:

Share and Enjoy

Quinn "The Eskimo!"
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1@apple.com"

Oh, and I filed a bug against that comment to see if we can avoid misleading other folks in the future.

Share and Enjoy

Quinn "The Eskimo!"
Apple Developer Relations, Developer Technical Support, Core OS/Hardware

let myEmail = "eskimo" + "1" + "@apple.com"

I don't know what kind of performance gains you hope for by using 8-bit addresses, but consider this code:


var array: [UInt8] = []

public func foo(i: Int) -> UInt8 {
    return array[i + 2]
}

public func foo8(i: UInt8) -> UInt8 {
    return array[Int(i + 2)]
}


This compiles to:


  pushq %rbp
  movq %rsp, %rbp
  addq $2, %rdi
  jo LBB1_3
  movq __Tv8test8bit5arrayGSaVSs5UInt8_(%rip), %rax
  cmpq 16(%rax), %rdi
  jae LBB1_3
  movb 32(%rdi,%rax), %al
  popq %rbp
  retq
LBB1_3:
  ud2


and


  pushq %rbp
  movq %rsp, %rbp
  addb $2, %dil
  jb LBB2_3
  movzbl %dil, %eax
  movq __Tv8test8bit5arrayGSaVSs5UInt8_(%rip), %rcx
  cmpq 16(%rcx), %rax
  jae LBB2_3
  movb 32(%rax,%rcx), %al
  popq %rbp
  retq
LBB2_3:
  ud2


The only differences are that the add is 8-bit instead of 64-bit and that the 8-bit value is zero-extended to a 64-bit value before it is used to index the array. Not zero-extending an 8-bit value before using it as an offset would not work, since I don't believe that the x86-64 architecture has instructions for combining addresses with different bit lengths. I don't know about Arm, but I guess that it will not be faster to use 8-bit addressing there either.

This is very interesting. The thing is that I have an array of UInt8 variables (this requirement is fixed). Some of these I need to use as an index into another array. That is how I ran into this issue.

So, if I understand you correctly, using anything other than CPU-sized integers (32/64 bit) will require extra operations. If that is the case, converting from UIntX to Int requires about the same number of operations as indexing directly with UIntX would, since the latter would still need the zero-extension. And this would then be the case for Swift, C, Objective-C, and C++ alike.

It could still be useful to keep them as UInt8 as long as possible since the optimizer then has access to 4 extra registers (AH through DH) on x86-64, but I don't know if that is relevant in practice.


The zero-extension instructions probably have a very small impact on performance, especially compared to the many jumps that are spread throughout the code for bounds checking. If you are very worried about performance, you can compile that specific code as unchecked (gets rid of the jumps), but I don't know if it is possible to do that without putting all unchecked code in a separate module.
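
For reference, the unchecked mode is the -Ounchecked optimization level (in Xcode, the optimization level labelled "Fast, Unchecked"); on the command line it looks something like this, with a made-up file name:

swiftc -Ounchecked FastCode.swift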


If you are comfortable reading assembly code, I would recommend that you look at the generated code of the part of the program your profiler tells you is the most CPU intensive. And remember, don't optimize too early, and don't optimize without measuring. I have crypto code in Swift that is very near my C code in performance, but then I have disabled bounds checking.
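
If you want to reproduce the assembly listings above, swiftc can emit them directly; the module name test8bit here matches the mangled symbol in the listings:

swiftc -O -emit-assembly test8bit.swift -o test8bit.s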
