For-in-loop on custom types for which SequenceType doesn't make sense.

Question

Created Nov ’15

Replies 10

Participants 4

As far as I can see, conforming to SequenceType is the only way to get a type to work with a for-in-loop.

However, SequenceType adds a lot of stuff that doesn't necessarily make sense for every custom type we might want to use with a for-in-loop.

(SequenceType's versions of map (always to an Array!), filter (to an Array!), sort (to an Array!) etc.)

So why isn't there a much simpler protocol that specifies/requires only what is needed for the for-in-loop to work?

(I guess such a protocol would require nothing more than a generate-method from its conforming types.

And SequenceType would then of course inherit from that protocol.)

Answer 1

Jessy OP

Nov ’15

I don't think it can be simplified. Do you have ideas?

AnyGenerator is handy for avoiding having to write a dedicated Generator type.

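struct IntegerSequence: SequenceType {
  init(limit: Int) {
    // Store a factory closure that builds a fresh generator for this limit.
    Iterator = IntegerSequence.Iterator_init(limit: limit)
  }

  private let Iterator: () -> AnyGenerator<Int>
  // Curried function: applying `limit` yields the () -> AnyGenerator<Int> factory.
  private static func Iterator_init(limit limit: Int)() -> AnyGenerator<Int> {
    var iteration = 0
    return anyGenerator {
      // Yields 1, 2, ..., limit, then nil to end the iteration.
      iteration++ < limit ? iteration : nil
    }
  }

  func generate() -> AnyGenerator<Int> {return Iterator()}
}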

I have filed a bug report about the lack of property initializers, which would let us clean it up like this (hypothetical syntax, it does not compile today):

struct IntegerSequence: SequenceType {
  init(limit: Int) {
    Iterator.init(limit: limit)
  }

  private let Iterator: () -> AnyGenerator<Int> {
  (limit: Int)() in
    var iteration = 0
    return anyGenerator {
      iteration++ < limit ? iteration : nil
    }
  }

  func generate() -> AnyGenerator<Int> {return Iterator()}
}

Answer 2

jawbroken OP

Nov ’15

Is this just a purity/pollution argument? As you likely know, you already only need to provide a generate() method to conform to SequenceType. I suppose if you want to provide your own map that preserves the type then the disambiguation could be annoying. There's always a tradeoff between making protocols more granular and how easy it is for people to find things in the documentation, etc, I guess.
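To make the point about generate() concrete, here is a minimal sketch in the Swift 2 syntax used elsewhere in this thread. Countdown is a hypothetical type invented for illustration. It implements only generate(), yet it still picks up map and filter from SequenceType's protocol extensions, and both return plain Arrays.

```swift
// Swift 2 syntax. Countdown is a hypothetical type used only for illustration.
struct Countdown: SequenceType {
    let start: Int

    // The only SequenceType requirement implemented by hand.
    func generate() -> AnyGenerator<Int> {
        var current = start
        return anyGenerator {
            // Yield start, start - 1, ..., 1, then nil to end the iteration.
            guard current > 0 else { return nil }
            defer { current -= 1 }
            return current
        }
    }
}

for n in Countdown(start: 3) {
    print(n)   // prints 3, then 2, then 1
}

// Inherited from SequenceType's protocol extensions; both results are Array<Int>.
let doubled = Countdown(start: 3).map { $0 * 2 }         // [6, 4, 2]
let evens = Countdown(start: 4).filter { $0 % 2 == 0 }   // [4, 2]
```

A type that wanted its own type-preserving map would have to coexist with the inherited one, which is where the disambiguation issue comes from.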

Answer 3

Jens OP

Nov ’15

If such a simpler protocol was factored out like this:

public protocol GeneratorProducingType {
    typealias Generator : GeneratorType
    @warn_unused_result
    func generate() -> Self.Generator
}

public protocol SequenceType : GeneratorProducingType {
    ...
}

Then there would be no tradeoff.

SequenceType is effectively unchanged, but there's now an option to conform only to GeneratorProducingType.

And the for-in statement would work on all types that conform to GeneratorProducingType (which I'm sure could be better named).

The purpose would simply be to isolate the "for-in-loopable" capability from a lot of methods that are (in the context of some particular types) too Array-centric.

So yes, I suppose you could say that it's "just" a purity/pollution argument, but all I really want is to be able to write this:

for n in someInstanceOfMyType {
   ...
}

instead of this:

var generator = someInstanceOfMyType.generate()
while let n = generator.next() {
   ...
}

and currently, if I want that, the only option is to "pollute" MyType with all that extra stuff that SequenceType adds (which may make perfect sense for most types, but not for some).

Answer 4

jawbroken OP

Nov ’15

Of course there is a tradeoff. I'm not arguing that the tradeoff isn't worthwhile in this case, just stating it exists. I don't believe that every protocol in the standard library should be broken out into one protocol per function and then recombined in various combinations, and you probably don't either.

I only meant “just” in the sense of “is there a more subtle point that I'm missing”, not to belittle your concern. I do see that it currently conflicts with providing your own functionality with the same name, which requires some annoying disambiguation or ugly workarounds like renaming functions. On balance I would be fine with your proposed modification.

Answer 5

Jens OP

Nov ’15

Ok, thanks.

I see what you meant by tradeoff now (but I couldn't see the disadvantage in this case).

It seems only natural for the std lib (and other libs) to evolve into being more compositional (ArithmeticType / NumericType / ScalarType, anyone?), i.e. more parts being expressed as recombinations of other parts. More granular and more reusable doesn't necessarily mean more parts. On the contrary, when done right it might mean fewer and better reused parts, which imply fewer and more nicely related concepts to explain and learn.

Answer 6

Jessy OP

Nov ’15

I'm trying to understand where you're coming from, but as far as I can tell, it just seems like you don't want concrete types returned from SequenceType methods. That's not a problem with SequenceType; that's just a feature that hasn't been implemented in Swift yet.

I'm used to appending .ToArray() or .ToList() to sequences. That works fine. I do think that returning Self probably makes more sense than returning SequenceType where Generator.Element == Generator.Element, but that wasn't how I've seen it done, so I don't know if problems are associated with returning Self.

Answer 7

Jens OP

Nov ’15

I agree that eg the map of SequenceType should return Self or rather Self<ElementTypeAsReturnedByTransform> and I understand that this is not (yet?) possible since Swift's type system isn't powerful enough to express a Mappable/Functor protocol. (I'm not sure that it ever will because I think there might be complications when it comes to type inference etc, but I don't know anything about these things.)

That aside, what I'm asking for here is simply a way to use a custom type with a for-in-loop without having to conform to SequenceType (because I don't want my custom type to be "polluted" by all the stuff that comes with SequenceType conformance), hence the proposed GeneratorProducingType protocol above.

Answer 8

Jens OP

Nov ’15

Another way to explain why I'd like to be able to use my type(s) with for-in without getting all the stuff from SequenceType et al. might be this example:

TL;DR:

My StaticArrayType(s) has type-level Count, thus SequenceType == bad fit, OK, fine, makes sense. But I'd still like to:

for e in myStaticArray { … }

I could have, if only the stdlib had a simpler protocol whose only requirement was that conforming types had a generate() method.

I have a protocol called StaticArrayType which describes a set of common requirements for concrete (value) types that have/are a statically allocated fixed number of elements (of type Element). So the memory layout of any static array type is fixed and both its element type and its count are encoded at the type level by the associated types Element: DefaultInitializable and Count: CountType.
The Count type can be used to type check the number of elements and the Element type can be used to type check the element type.

(The memory layout of any static array type is fixed and "known" since e.g. a static array type of 4 Floats (StaticArray<Float>.WithCount4) will have a strideof that is 4 * strideof(Float) == 16 bytes.)

These types are all subscriptable, have count and indices properties etc. Much like a MutableCollectionType. I have working code for this, and it's a very useful zero cost abstraction (the optimizer generates code that is the same as if I had used corresponding simd types, etc). However:

As already mentioned, the StaticArrayType protocol can be made to inherit from e.g. MutableCollectionType, but I decided not to, for reasons like these:

- SubSequence doesn't make sense for StaticArrayType (making sense would mean being a different type for each different length/count).

- The same is true for eg map, filter, etc. (They would have to return different types depending on the count of the result).

Put differently: As the count (number of elements) of a StaticArrayType is a type-level thing, it doesn't make sense to mix it with constructs in which the count is a runtime thing (such as CollectionType, SubSequence and SequenceType).

So it ends up being better to redefine any possibly relevant similar concepts (eg SubSequence, MutableSlice, map, filter etc) specifically for StaticArrayType.

Note that I am totally fine with not having my StaticArrayType protocol inherit from CollectionType/SequenceType, as it makes perfect sense (for the reasons outlined above).

BUT this also has one (somewhat surprising) consequence, namely that I will not be able to write this:

1. for e in myStaticArray { ... use e in some (non-mutating) way ... }

but instead I have to write this:

2. for i in myStaticArray.indices { ... use myStaticArray[i] in some (non-mutating) way ... }

So why shouldn't my StaticArrayType be able to inherit from some less bloated / presumptuous protocol that simply had _one_ requirement: that the type has a generate() method and thus can be used in a for-in statement like (1) above?

Or: Why do for-in statements like (1) require SequenceType conformance when the only thing they actually need is a generate() method?

Answer 9

GSnyder OP

Nov ’15

It's inherent in iteration that clients have access to individual elements. There's nothing to stop a client from, say, accumulating the elements into a standard Array. The stdlib SequenceType extensions are really just shortcuts for that sort of thing, one-liners you could write yourself but which would create clutter if you did. As a SequenceType client, I don't have the expectation that every type will implement every extension in a type-optimal way (though it's nice to know that it could!).
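As a rough illustration of that point, here is a sketch (Swift 2 syntax, matching the rest of the thread) of a client accumulating elements using nothing but a generator. The function name collect is hypothetical and is not part of the stdlib.

```swift
// Swift 2 syntax. collect is a hypothetical helper, not a stdlib function.
// It needs nothing from SequenceType: any GeneratorType is enough.
func collect<G: GeneratorType>(generator: G) -> [G.Element] {
    var remaining = generator          // local mutable copy, since next() mutates
    var result: [G.Element] = []
    while let element = remaining.next() {
        result.append(element)
    }
    return result
}

// Any type that hands out a generator can be drained this way.
let collected = collect([10, 20, 30].generate())   // [10, 20, 30]
```

Once the elements are in an Array, map, filter, sort and the rest are available regardless of what the original type chose to expose.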

If I understand your viewpoint, you'd like to forbid the operations you don't want to implement in a type-specific way, lest someone use them without realizing that they've gone off the rails. But philosophically, I'd prefer you not obstruct me in this way without a specific motivation beyond "it won't be optimized efficiently."

That said, if you must lock down the type, an iteration-specific subprotocol seems like a reasonable solution. Currently, there doesn't seem to be any way of doing this cleanly. I'd vote for an underbar-prefixed name, though -- the risk of people mistakenly latching onto an iteration protocol as the default is relatively high.

Answer 10

Jens OP

Nov ’15

My viewpoint: I have a type, eg my StaticArrayType, and I don't want it to conform to SequenceType (1), BUT I still want to be able to:

for e in myStaticArrayType { ... e ... }

(1) Because SequenceType et al. assume count is a runtime thing (which is totally OK for most sequences), they are a bad fit for e.g. my StaticArrayType type(s), in which Count is a type-level thing. It's not a bad fit for performance reasons; it's because almost everything that is required by and comes with SequenceType (1a) becomes just confusing clutter in the context of my static array types. SubSequence is a particularly good example of something that just doesn't make any sense at all in this context, as mentioned in my post above.

(1a) Except generate, subscript, indices and count (yes, it still needs an Int representation of the CountType).

Yes, I wish the std lib would factor such a simpler protocol out of SequenceType. But I'm not sure about the underscore; I'd prefer a good name and good documentation that makes it easy to decide whether you want e.g. MutableCollectionType, CollectionType, SequenceType or perhaps just GeneratorProducingType (but better named).

For the time being, I added a computed property to the StaticArrayType protocol that returns a StaticArraySequence<T: StaticArrayType>, so I can write it like this:

for e in myStaticArrayType.sequence { ... e ... }

That way, the clutter is isolated under that computed property, which has no other real use than in such for-in statements.
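Roughly, the workaround looks like the sketch below (Swift 2 syntax). The protocol here is a heavily simplified, hypothetical stand-in for the real StaticArrayType: it assumes only an Int count and an Int subscript, and leaves out the type-level Count entirely. Pair is a toy conforming type made up for the example.

```swift
// Swift 2 syntax. A simplified, hypothetical stand-in for the real protocol.
protocol StaticArrayType {
    typealias Element
    var count: Int { get }                       // Int representation of the count
    subscript(index: Int) -> Element { get }
}

// The wrapper is the only type that conforms to SequenceType.
struct StaticArraySequence<T: StaticArrayType>: SequenceType {
    let base: T

    func generate() -> AnyGenerator<T.Element> {
        let base = self.base
        var index = 0
        return anyGenerator {
            // Walk the indices 0..<count, then return nil to end the iteration.
            guard index < base.count else { return nil }
            defer { index += 1 }
            return base[index]
        }
    }
}

extension StaticArrayType {
    // All SequenceType API is reachable only through this property.
    var sequence: StaticArraySequence<Self> {
        return StaticArraySequence(base: self)
    }
}

// Toy conforming type with exactly two elements.
struct Pair: StaticArrayType {
    var first: Float
    var second: Float
    var count: Int { return 2 }
    subscript(index: Int) -> Float {
        return index == 0 ? first : second
    }
}

let pair = Pair(first: 1.5, second: 2.5)
for e in pair.sequence {
    print(e)
}
```

Pair itself has no map, filter or SubSequence; those exist only on the StaticArraySequence wrapper.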
