Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Containers #2172

Closed
p6rt opened this issue Sep 16, 2010 · 5 comments
Closed

Containers #2172

p6rt opened this issue Sep 16, 2010 · 5 comments
Labels

Comments

@p6rt
Copy link

p6rt commented Sep 16, 2010

Migrated from rt.perl.org#77896 (status was 'resolved')

Searchable as RT77896$

@p6rt
Copy link
Author

p6rt commented Sep 16, 2010

From @Kodiologist

This patch adds a bunch of container types and lets you do 'my %h is
SomeType'. See the commit message for the details.

@p6rt
Copy link
Author

p6rt commented Sep 16, 2010

From @Kodiologist

containers.patch
From a010550295d5ef9392e5cc0e60dd120fb0fe79e8 Mon Sep 17 00:00:00 2001
From: Kodi Arfer <hippo@Thoth.(none)>
Date: Thu, 16 Sep 2010 09:14:23 -0400
Subject: [PATCH] Containers galore.

I added the Bag, KeyHash, KeySet, and KeyBag types and modified Set so
that it can pass S02-builtin_data_types/set.t. I also edited
Actions.pm so you can say 'my %h is SomeType', although this only
works for variables with the % sigil, and the type has to be just a
type name, so for now if you want to say 'my %h is KeyHash[Int, 42]'
you have to instead say 'role R does KeyHash[Int, 42]; my %h is R'.

To make it possible to implement KeyHashes, I implemented a crude
means of overloading &infix:<=>: if you say '$c = $v' and the
container of $c has a PMC property "assignment_trapper" with value $t,
then &infix:<=> calls '$t.ASSIGN($v)' instead of the 'setref' opcode.

I spun the weighted-picking logic in Hash off into its own role,
WeightedPick, so Bag could use it. We can reverse this once hashes
permit non-string keys, in which case Bag (along with Set) can be
reimplemented with a hash.

While I was writing a patch, I fixed a bug in Duration.pm that was
exposed when masak++ made the Instant class do Real.
---
 build/Makefile.in       |    4 +-
 docs/ChangeLog          |    2 +
 docs/ROADMAP            |    3 +-
 src/Perl6/Actions.pm    |   12 ++++-
 src/builtins/assign.pir |    8 +++
 src/core/Bag.pm         |   59 +++++++++++++++++++++++++
 src/core/Duration.pm    |    3 -
 src/core/Hash.pm        |  112 ++++++++++++++++++++--------------------------
 src/core/KeyHash.pm     |  104 +++++++++++++++++++++++++++++++++++++++++++
 src/core/Set.pm         |   24 ++++++++--
 t/spectest.data         |    8 +++
 11 files changed, 264 insertions(+), 75 deletions(-)
 create mode 100644 src/core/Bag.pm
 create mode 100644 src/core/KeyHash.pm

diff --git a/build/Makefile.in b/build/Makefile.in
index f86bc9e..5e20f97 100644
--- a/build/Makefile.in
+++ b/build/Makefile.in
@@ -191,7 +191,6 @@ CORE_SOURCES = \
   src/core/Cool-num.pm \
   src/core/Cool-str.pm \
   src/core/List.pm \
-  src/core/Set.pm \
   src/core/Array.pm \
   src/core/Numeric.pm \
   src/core/Real.pm \
@@ -210,6 +209,9 @@ CORE_SOURCES = \
   src/core/Range.pm \
   src/core/EnumMap.pm \
   src/core/Hash.pm \
+  src/core/Set.pm \
+  src/core/Bag.pm \
+  src/core/KeyHash.pm \
   src/core/Enum.pm \
   src/core/IO.pm \
   src/core/IO/ArgFiles.pm \
diff --git a/docs/ChangeLog b/docs/ChangeLog
index 5806310..94eb888 100644
--- a/docs/ChangeLog
+++ b/docs/ChangeLog
@@ -3,6 +3,8 @@
 + Instants and Durations
 + speedup for slurp() and .reverse built-ins
 + various improvements to the Set type
++ add Bag, KeyHash, KeySet, KeyBag, KeyWeight
++ permit setting the implementation type of a %var with 'is' ('my %h is KeyBag')
 + revamp of series operator code, and adaption to new spec
 + implement ...^ up-to-but-excluding-series operator
 + allow :r and :ratchet modifiers on regex quoting constructs
diff --git a/docs/ROADMAP b/docs/ROADMAP
index e6474e1..4913e4f 100644
--- a/docs/ROADMAP
+++ b/docs/ROADMAP
@@ -34,7 +34,7 @@ Ought to have items
 2 **    Rat, BigNum, numification improvements (C, colomon)
 2 ***   temp variables (C)
 2 ****  better longest token matching in regexes (A, D, pmichaud)
-2 ?     other S02 data types -- KeySet, KeyBag (A)
+2 ?     other S02 data types (A)
 2 ?     specialized Unicode bits -- .codes, .graphs, .bytes (A, C)
 
 Nice to have items
@@ -115,3 +115,4 @@ Completed ROADMAP items:
   - true hyper/cross/reverse/other metaoperators
   - synopsis 19 handling
   - other REPL improvements
+  - KeyHash and derivatives
diff --git a/src/Perl6/Actions.pm b/src/Perl6/Actions.pm
index 99f8231..efccf60 100644
--- a/src/Perl6/Actions.pm
+++ b/src/Perl6/Actions.pm
@@ -1049,9 +1049,15 @@ sub declare_variable($/, $past, $sigil, $twigil, $desigilname, $trait_list) {
     else {
         # Not an attribute - need to emit delcaration here.
         # Create the container
-        my $cont := $sigil eq '%' ??
-            PAST::Op.new( :name('&CREATE_HASH_FROM_LOW_LEVEL'), :pasttype('call') ) !!
-            PAST::Op.new( sigiltype($sigil), :pirop('new Ps') );
+        my $cont;
+        if $sigil eq '%' {
+            my $itype := has_compiler_trait($trait_list, '&trait_mod:<is>');
+            $cont := $itype
+               ?? PAST::Op.new(:pasttype('callmethod'), :name('new'), $itype[0])
+               !! PAST::Op.new( :name('&CREATE_HASH_FROM_LOW_LEVEL'), :pasttype('call') );
+        } else {
+            $cont := PAST::Op.new( sigiltype($sigil), :pirop('new Ps') );
+        }
         
         # Give it a 'rw' property unless it's explicitly readonly.
         my $readtype := trait_readtype($trait_list);
diff --git a/src/builtins/assign.pir b/src/builtins/assign.pir
index d462c4c..9963895 100644
--- a/src/builtins/assign.pir
+++ b/src/builtins/assign.pir
@@ -66,6 +66,11 @@ src/builtins/assign.pir - assignment operations
     whence()
 
   scalar_assign:
+    .local pmc trapper
+    trapper = getprop 'assignment_trapper', cont
+    if null trapper goto check_nil_assign
+    goto item_assign
+  check_nil_assign:
     # check for Nil assignment
     $I0 = isa source, ['Nil']
     unless $I0 goto item_assign
@@ -81,6 +86,9 @@ src/builtins/assign.pir - assignment operations
     source = source.'item'()
   have_source:
     source = descalarref source
+    if null trapper goto normal_assignment
+    .tailcall trapper.'ASSIGN'(source)
+  normal_assignment:
     setref cont, source
     .return (cont)
 
diff --git a/src/core/Bag.pm b/src/core/Bag.pm
new file mode 100644
index 0000000..9eff052
--- /dev/null
+++ b/src/core/Bag.pm
@@ -0,0 +1,59 @@
+use v6;
+
+class Bag does Associative does WeightedPick {
+    # We could use a hash here, but right now hash keys coerce to Str,
+    # so instead let's use an array of Pairs for the time being.
+    has @!p;
+
+    multi readonly(@a) { @a }
+    multi readonly($x) { $x }
+
+    multi method new()       { self.bless: * }
+
+    multi method new(@elems) {
+        my @p;
+        for @elems -> $e {
+            if @p.grep(*.key eqv $e)[0] {
+                ++$^p.value;
+            } else {
+                push @p, $e => 1;
+            }
+        }
+        self.bless: *, p => @p;
+    }
+
+    multi method new(*@elems) { self.new: @elems }
+
+    multi method new(Bag $b) { $b }
+
+    multi method new(Set $s) { self.new: $s.keys }
+
+    method !STORE(\$args) {
+        die "Bags are immutable, but you tried to modify one"
+    }
+
+    method at_key($k) { 
+        .key eqv $k and return readonly .value for @!p;
+        0;
+    }
+
+    method keys()     { readonly @!p>>.key }
+    method values()   { readonly @!p>>.value }
+    method pairs()    { readonly @!p }
+    method elems()    { [+] self.values }
+    method exists($k) { ? do self.keys.grep: * eqv $k }
+
+    method Bool()     { ?@!p }
+    method Numeric()  { self.elems }
+    method Str()      { self.perl }
+    method hash()     { @!p.hash }
+    method flat()     { @!p.flat }
+
+    method perl() {
+        'bag(' ~ self.list>>.perl.join(', ') ~ ')';
+    }
+}
+
+our sub bag(*@args) { Bag.new: |@args }
+
+# vim: ft=perl6
diff --git a/src/core/Duration.pm b/src/core/Duration.pm
index 97effe2..3f98ab7 100644
--- a/src/core/Duration.pm
+++ b/src/core/Duration.pm
@@ -30,9 +30,6 @@ our multi sub infix:<+>(Duration $a, Duration $b) {
 our multi sub infix:<->(Duration $a, Real $b) {
     Duration.new: $a.x - $b;
 }
-our multi sub infix:<->(Real $a, Duration $b) {
-    Duration.new: $a - $b.x;
-}
 our multi sub infix:<->(Duration $a, Duration $b) {
     Duration.new: $a.x - $b.x;
 }
diff --git a/src/core/Hash.pm b/src/core/Hash.pm
index b4e135c..1f1661a 100644
--- a/src/core/Hash.pm
+++ b/src/core/Hash.pm
@@ -1,4 +1,52 @@
-role Hash is EnumMap {
+use v6;
+
+role WeightedPick {
+
+    sub weighted-pick(@pairs) {
+        my @weights = [\+] @pairs>>.value;
+        my $value = @weights[*-1].rand;
+        return @pairs[0] if @weights[0] > $value;
+        my ($l, $r) = (0, @weights.end);
+        my $middle = floor ($r + $l) / 2;
+        while $middle > $l {
+            if @weights[$middle] < $value {
+                $l = $middle;
+            }
+            else {
+                 $r = $middle;
+            }
+            $middle = floor ($r + $l) / 2;
+        }
+        @pairs[$r];
+    }
+
+    multi method pick() { weighted-pick(self.pairs).key }
+
+    multi method pick($n is copy) {
+        $n <= 0 and return Nil;
+        my @p = self.pairs>>.clone.grep: *.value;
+        gather {
+            while $n-- && @p {
+                my $pp = weighted-pick @p;
+                take $pp.key;
+                --$pp.value or @p .= grep: {.key !eqv $pp.key};
+            }
+        }
+    }
+
+    multi method pick(Whatever) { self.pick: Inf }
+
+    multi method roll($n is copy = 1) {
+       gather {
+            take self.pick for ^$n;
+       }
+    }
+
+    multi method roll(Whatever) { self.roll: Inf }
+
+}
+
+role Hash is EnumMap does WeightedPick {
     method at_key($key) {
         my $z = Any!butWHENCE(
                     { pir::set__vQsP($!storage, $key, $z); }
@@ -97,68 +145,6 @@ role Hash is EnumMap {
         self.pairs.sort(&by)
     }
 
-    multi method pick($num is copy = 1) {
-        if ($num == 1) {
-            my @weights = [\+] self.values;
-            my $value = @weights[*-1].rand;
-            return self.keys[0] if @weights[0] > $value;
-            my ($l, $r) = (0, @weights.elems-1);
-            my $middle = floor ($r + $l) / 2;
-            while $middle > $l {
-                if @weights[$middle] < $value {
-                    $l = $middle;
-                }
-                else {
-                     $r = $middle;
-                }
-                $middle = floor ($r + $l) / 2;
-            }
-            return self.keys[$r];
-        }
-
-        my %copyHash = @.pairs.grep({ .value != 0});
-        gather {
-            while $num > 0 && %copyHash {
-                take my $picked = %copyHash.pick();
-                unless --%copyHash{$picked} {
-                    %copyHash.delete($picked);
-                }
-                $num--;
-            }
-        }
-    }
-
-    multi method pick(Whatever) {
-        self.pick(Inf);
-    }
-
-    multi method roll($num is copy = 1) {
-        if ($num == 1) {
-            my @weights = [\+] self.values;
-            my $value = @weights[*-1].rand;
-            return self.keys[0] if @weights[0] > $value;
-            my ($l, $r) = (0, @weights.elems-1);
-            my $middle = floor ($r + $l) / 2;
-            while $middle > $l {
-                if @weights[$middle] < $value {
-                    $l = $middle;
-                }
-                else {
-                     $r = $middle;
-                }
-                $middle = floor ($r + $l) / 2;
-            }
-            return self.keys[$r];
-        }
-
-        gather {
-            take self.roll() for ^$num;
-        }
-    }
-
-    multi method roll(Whatever) {
-        self.roll(Inf);
-    }
 }
 
 
diff --git a/src/core/KeyHash.pm b/src/core/KeyHash.pm
new file mode 100644
index 0000000..3374f5d
--- /dev/null
+++ b/src/core/KeyHash.pm
@@ -0,0 +1,104 @@
+use v6;
+
+class KeyHashTrapper {
+
+    has $.kh;
+    has $.key;
+
+    # # This fails with errors that change from run to run—perhaps
+    # # because of a bug in Hash.pm.
+    # method ASSIGN($val) {
+    #     $.kh.okay-val($val)
+    #       ?? ($.kh.Hash::at_key($.key) = $val)
+    #       !! $.kh.Hash::delete($.key);
+    #     $.kh{$.key};
+    # }
+
+    method ASSIGN($val) { Q:PIR {
+        .local pmc self
+        self = find_lex 'self'
+        .local pmc kh
+        kh = getattribute self, '$!kh'
+        .local pmc store
+        store = getattribute kh, '$!storage'
+        .local pmc key
+        key = getattribute self, '$!key'
+        .local pmc val
+        val = find_lex '$val'
+
+        $P0 = kh.'okay-val'(val)
+        if $P0 goto set
+        delete store[key]
+        goto done
+      set:
+        store[key] = val
+      done:
+        %r = kh.'at_key'(key)
+    } }
+
+}
+
+class DefaultDefault {};
+
+role KeyHash[::T = Any, $default = DefaultDefault] does Hash {
+
+    has $!default-value = $default ~~ DefaultDefault
+      ?? do { given T {
+          when Bool       { False }
+          when Num        { 0.Num }
+          when Rat        { 0.Rat }
+          when Numeric    { 0 }
+          when Stringy    { '' }
+          default         { T }
+        } }
+      !! $default;
+
+    method !get($key) { Q:PIR {
+        $P0 = find_lex 'self'
+        $P1 = getattribute $P0, '$!storage'
+        $P0 = find_lex '$key'
+        %r = $P1[$P0]
+    } }
+
+    method at_key($key) {
+        my T $subscript = self.exists($key)
+          ?? self!get: $key
+          !! $!default-value;
+        pir::setprop__vPsP($subscript, 'assignment_trapper',
+            KeyHashTrapper.new: kh => self, :$key);
+        $subscript;
+    }
+
+    method okay-val($v) {
+        $default ~~ DefaultDefault ?? ?$v !! $v !eqv $default
+    }
+
+    method elems() { [+] self.values }    
+
+    method grab(Int $n is copy = 1) {
+        gather {
+            while $n-- && self.keys {
+                take my $picked = self.pick;
+                --self.{$picked} or self.delete: $picked;
+            }
+        }
+    }
+
+    method Numeric() { self.elems }
+
+    method perl() { sprintf 'KeyHash[%s, %s].new(%s)',
+        T.perl,
+        $!default-value.perl,
+        self.pairs.map(*.perl).join(', ')
+    }
+
+}
+
+class KeySet does KeyHash[Bool] { }
+
+class KeyBag does KeyHash[Int] {
+    # Without real UInts, we have to cheat a bit.
+    method okay-val($v) { $v > 0 }
+}
+
+class KeyWeight does KeyHash[Rat] { } # XXX Should be FatRat
diff --git a/src/core/Set.pm b/src/core/Set.pm
index ac3d26e..85e74a0 100644
--- a/src/core/Set.pm
+++ b/src/core/Set.pm
@@ -1,8 +1,13 @@
+use v6;
+
 class Set does Associative {
     # We could use a hash here, but right now hash keys coerce to Str,
     # so instead let's use an array and &uniq for the time being.
     has @!elems;
 
+    multi method new() {
+        self.bless: *;
+    }
     multi method new(@elems) {
         self.bless(self.CREATE, :elems( uniq @elems ));
     }
@@ -16,6 +21,10 @@ class Set does Associative {
         $set;
     }
 
+    method !STORE(\$args) {
+        die 'Sets are immutable, but you tried to modify one'
+    }
+
     sub contains(@array, $value) {
         for @array {
             if $value === $_ {
@@ -25,7 +34,7 @@ class Set does Associative {
         return False;
     }
 
-    method keys() { @!elems }
+    method keys() { { @^readonly-elems }(@!elems) }
     method values() { True xx +@!elems }
     method elems() { +@!elems }
     method exists($elem) { contains(@!elems, $elem) }
@@ -34,9 +43,14 @@ class Set does Associative {
         contains(@!elems, $key);
     }
 
-    method Num() { +self.elems }
     method Bool() { ?self.elems }
-    method hash() { hash @!elems Z=> True xx * }
+    method Numeric() { +self.elems }
+    method Str() { self.perl }
+    method hash() { hash self.flat }
+    method flat() { @!elems Z=> True xx * }
+
+    method pick(*@args) { @!elems.pick: |@args }
+    method roll(*@args) { @!elems.roll: |@args }
 
     multi method union(@otherset) {
         self.new((@!elems, @otherset));
@@ -93,7 +107,7 @@ class Set does Associative {
     }
 
     method perl() {
-        'Set.new(' ~ join(', ', map { .perl }, @!elems) ~ ')';
+        'set(' ~ join(', ', map { .perl }, @!elems) ~ ')';
     }
 }
 
@@ -137,4 +151,6 @@ our multi sub  infix:«(>)»(    %a, %b) { Set.new( %a).superset(%b) }
 our multi sub  infix:«(>)»(    @a, %b) { Set.new(|@a).superset(%b) }
 our multi sub  infix:«(>)»(    @a, @b) { Set.new(|@a).superset(@b) }
 
+our sub set(*@args) { Set.new: |@args }
+
 # vim: ft=perl6
diff --git a/t/spectest.data b/t/spectest.data
index c84a7b0..3bbd41a 100644
--- a/t/spectest.data
+++ b/t/spectest.data
@@ -68,6 +68,7 @@ S02-builtin_data_types/array_extending.t
 S02-builtin_data_types/array_ref.t
 S02-builtin_data_types/array.t
 S02-builtin_data_types/assigning-refs.t
+S02-builtin_data_types/bag.t
 S02-builtin_data_types/bool.t
 S02-builtin_data_types/capture.t
 S02-builtin_data_types/catch_type_cast_mismatch.t
@@ -76,6 +77,11 @@ S02-builtin_data_types/flattening.t
 S02-builtin_data_types/hash_ref.t
 S02-builtin_data_types/hash.t
 S02-builtin_data_types/infinity.t
+S02-builtin_data_types/instants-and-durations.t
+S02-builtin_data_types/keybag.t
+S02-builtin_data_types/keyhash.t
+S02-builtin_data_types/keyset.t
+S02-builtin_data_types/keyweight.t
 S02-builtin_data_types/lists.t
 S02-builtin_data_types/mixed_multi_dimensional.t
 S02-builtin_data_types/multi_dimensional_array.t
@@ -88,6 +94,7 @@ S02-builtin_data_types/pair.t
 S02-builtin_data_types/parcel.t
 S02-builtin_data_types/parsing-bool.t
 S02-builtin_data_types/range.t
+S02-builtin_data_types/set.t
 S02-builtin_data_types/sigils-and-types.t
 S02-builtin_data_types/subscripts_and_context.t
 S02-builtin_data_types/type.t
@@ -596,6 +603,7 @@ S32-str/words.t                                            # icu
 S32-temporal/calendar.t
 S32-temporal/Date.t
 S32-temporal/DateTime.t
+S32-temporal/DateTime-Instant-Duration.t
 S32-trig/e.t
 # S32-trig/pi.t
 S32-trig/sin.t                                             # long
-- 
1.7.0.4

@p6rt
Copy link
Author

p6rt commented Sep 16, 2010

From @pmichaud

On Thu Sep 16 06​:21​:00 2010, openstdout@​gmail.com wrote​:

This patch adds a bunch of container types and lets you do 'my %h is
SomeType'. See the commit message for the details.

Thanks for the contribution. Unfortunately, the patch combines too many
individual features together to be reviewed effectively -- please split
it up into smaller patches (e.g., the patch for Instant/Duration should
be separate).

We shouldn't need to invent a new way of "overloading" infix​:<=> --
simply use the existing one of providing a C<!STORE> method. If that
isn't sufficient, we probably need to figure out and document why.

Also, the C<Hash> role (may be class) is about to receive a significant
refactor, I'd prefer to add in the other hash-related capabilities after
that refactor has been done. Regardless, if there's going to be a
WeightedPick role (the synopses do not mention one), it needs to be in
its own source code file. I suspect that WeightedPick capabilities may
belong in a library and not in the core.

Thanks again, I look forward to reviewing revised patches!

Closing ticket (submit new patches as individual tickets for the
capability provided).

Pm

@p6rt
Copy link
Author

p6rt commented Sep 16, 2010

The RT System itself - Status changed from 'new' to 'open'

@p6rt
Copy link
Author

p6rt commented Sep 16, 2010

@pmichaud - Status changed from 'open' to 'resolved'

@p6rt p6rt closed this as completed Sep 16, 2010
@p6rt p6rt added the patch label Jan 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant