Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implementation of transliteration #125

Closed
p6rt opened this issue Jun 8, 2008 · 26 comments
Closed

implementation of transliteration #125

p6rt opened this issue Jun 8, 2008 · 26 comments
Labels

Comments

@p6rt
Copy link

p6rt commented Jun 8, 2008

Migrated from rt.perl.org#55492 (status was 'resolved')

Searchable as RT55492$

@p6rt
Copy link
Author

p6rt commented Jun 8, 2008

From @cjfields

Attached is a partial PIR-based implementation of trans() for Str.pir
(for transliteration); diff run off r28183. It handles very simple
transliteration only and doesn't deal with strings that are
interpolated as Ranges, so it will need some more significant tweaking
to get everything working. I will add a version of trans.t to pugs
that works with Rakudo; this patch doesn't pass all transliteration
tests yet.

A more fully implemented p6-based version is being developed and will
be added when using p6 code for classes becomes possible.

chris

@p6rt
Copy link
Author

p6rt commented Jun 8, 2008

From @cjfields

trans.diff

@p6rt
Copy link
Author

p6rt commented Jun 9, 2008

From @cjfields

Attaching updated diff file (against r28209) which passes 5/17 tests. Will commit rakudo-
compliant test suite to pugs when I determine (via #perl6) how to go about it w/o breaking the
already-present trans.t in t/spec.

@p6rt
Copy link
Author

p6rt commented Jun 9, 2008

From @cjfields

trans.diff
*** Str.pir	2008-06-08 23:28:40.000000000 -0500
--- Str.pir.back	2008-06-08 13:37:39.000000000 -0500
***************
*** 213,283 ****
      .return(retv)
  .end
  
- =item trans()
- 
- Partial implementation of transliteration.  Handles simple p6-like
- transliteration (see S05):
- 
-   "ABC".trans( ('A'=>'a'), ('B'=>'b'), ('C'=>'c') ) # abc
-   "ABC".trans('XYZ' => 'xyz')
-   "ABCXYZ".trans( (['A'..'C'] => ['a'..'c']), (<X Y Z> => <x y z>)
-   
-   Does not interpolate strings 'A..Z' or 'A-Z' into Ranges ('A'..'Z') yet
-   
- =cut
- 
- .sub 'trans' :method
-     .param pmc args :slurpy
-     .local pmc pair
-     .local pmc table
-     .local pmc tmps
-     .local pmc retv
-     
-     table = new 'Hash'
-     retv = new 'Perl6Str'
-     tmps = self
-     
-   pair_loop:
-     unless args, st_trans
-     pair = shift args
-     .local pmc key, val, key_it, val_it
-     # should handle Array key/value pairs
-     key = pair.'key'()
-     val = pair.'value'()
-     key_it = new 'Iterator', key
-     val_it = new 'Iterator', val
-     
-   mapping:
-     unless key_it, pair_loop
-     $S0 = shift key_it
-     $S1 = shift val_it
-     table[$S0] = $S1
-     goto mapping
-     
-   st_trans:
-     # 'Perl6Str' iterator is not working here for some reason (not implemented)
-     .local int pos, len
-     .local string orig, rep
-     len = tmps.'chars'()
-     pos = 0
-     
-   trans_loop:
-     if pos == len goto done
-     orig = substr tmps, pos, 1
-     rep = table[orig]
-     if rep, next
-     rep = orig
-     
-   next:
-     concat retv, rep
-     pos += 1
-     goto trans_loop
-     
-   done: 
-     .return(retv)
- .end
- 
- =cut
  
  =item perl()
  
--- 213,218 ----

@p6rt
Copy link
Author

p6rt commented Jun 9, 2008

@cjfields - Status changed from 'new' to 'open'

@p6rt
Copy link
Author

p6rt commented Jun 10, 2008

From @cjfields

S05-transliteration/trans.t has been updated in pugs, should pass 5 of 17 tests now with last
patch

@p6rt
Copy link
Author

p6rt commented Jun 21, 2008

From @cjfields

Attached patch passes all tests (except5 tests for tr///, which isn't implemented).

@p6rt
Copy link
Author

p6rt commented Jun 21, 2008

From @cjfields

trans.diff
*** Str.pir.old	2008-06-21 16:59:14.000000000 -0500
--- Str.pir	2008-06-21 17:55:45.000000000 -0500
***************
*** 213,218 ****
--- 213,339 ----
      .return(retv)
  .end
  
+ =item trans()
+ 
+ Partial implementation of transliteration.  Handles simple p6-like
+ transliteration (see S05):
+ 
+   "ABC".trans( ('A'=>'a'), ('B'=>'b'), ('C'=>'c') ) # abc
+   "ABC".trans('ABC' => 'abc')
+   "ABCXYZ".trans( (['A'..'C'] => ['a'..'c']), (<X Y Z> => <x y z>) )
+   
+   Does not interpret strings 'A..Z' or 'A-Z' into Ranges ('A'..'Z') yet
+   
+ =cut
+ 
+ .sub '!transmap' :multi(_)
+     .param pmc range
+     .local pmc retval, it
+     retval = new 'ResizablePMCArray'
+   range_loop:
+     unless range, done
+     $S0 = range.'shift'()
+     push retval, $S0
+     goto range_loop
+   done:
+     .return(retval)
+ .end
+ 
+ .sub '!transmap' :multi('Perl6Str')
+     .param pmc str
+     .local pmc retval, prior, next, rclass, range, list, frm, to
+     .local int end, pos
+     retval = new 'ResizablePMCArray'
+     prior = new 'ResizablePMCArray'    
+     prior = str.'split'('..')
+     end = prior
+     dec end
+     pos = 0 
+   process_prior:
+     unless prior, done
+     # set is a String, not a Perl6Str
+     $S0 = shift prior
+     next = new 'ResizablePMCArray'
+     next = split '', $S0
+     if pos == 0 goto process_retval
+     # set up Perl6 Range here; when .HLL mapping works for Str may refactor
+   get_from:
+     $S1 = pop retval
+     if $S1 == ' ' goto get_from
+     frm = new 'Perl6Str'
+     frm = $S1
+   get_to:
+     $S1 = shift next
+     if $S1 == ' ' goto get_to
+     to = new 'Perl6Str'
+     to = $S1
+   set_range:
+     rclass = get_hll_global 'Range'
+     range = rclass.'new'('from'=>frm, 'to'=>to)
+   process_list:
+     unless range, process_retval
+     $S2 = range.'shift'()
+     push retval, $S2
+     goto process_list
+   process_retval:
+     unless next, process_prior
+     $S3 = shift next
+     push retval, $S3
+     inc pos
+     goto process_retval
+   done:
+     .return(retval)
+ .end
+ 
+ .sub 'trans' :method
+     .param pmc args :slurpy
+     .local pmc pair
+     .local pmc table
+     .local pmc tmps
+     .local pmc retv
+     table = new 'Hash'
+     retv = new 'Perl6Str'
+     tmps = self
+   pair_loop:
+     unless args, st_trans
+     pair = shift args
+     .local pmc key, val, keymap, valmap
+     .local string lv
+     key = pair.'key'()
+     val = pair.'value'()
+     keymap = '!transmap'(key)
+     valmap = '!transmap'(val)
+   mapping:
+     unless keymap, pair_loop
+     # reconfigure to let val 'overflow'
+     $S0 = shift keymap
+     unless valmap, get_prev
+     lv = shift valmap
+   get_prev:
+     $S1 = lv
+     table[$S0] = $S1
+     goto mapping
+   st_trans:
+     # 'Perl6Str' iterator is not working here for some reason (not implemented)
+     .local int pos, len
+     .local string orig, rep
+     len = tmps.'chars'()
+     pos = 0
+   trans_loop:
+     if pos == len goto done
+     orig = substr tmps, pos, 1
+     rep = table[orig]
+     if rep, next
+     rep = orig
+   next:
+     concat retv, rep
+     pos += 1
+     goto trans_loop
+   done: 
+     .return(retv)
+ .end
+ 
+ =cut
  
  =item perl()
  

@p6rt
Copy link
Author

p6rt commented Jun 21, 2008

From @cjfields

On Sat Jun 21 16​:14​:31 2008, cjfields wrote​:

Attached patch passes all tests (except5 tests for tr///, which isn't
implemented).

D'oh! POD needs to be updated to reflect strings are interpreted to look for ranges.

@p6rt
Copy link
Author

p6rt commented Jun 23, 2008

From @cjfields

On Sat Jun 21 16​:33​:22 2008, cjfields wrote​:

On Sat Jun 21 16​:14​:31 2008, cjfields wrote​:

Attached patch passes all tests (except5 tests for tr///, which
isn't
implemented).

Marking as stalled until I can get a Str.pir patch for handling one-to-many and many-to-one
versions of trans(), where key/value is Array. It will require some backtracking for at least the
many-to-one case to get longest leftmost match.

@p6rt
Copy link
Author

p6rt commented Jun 23, 2008

@cjfields - Status changed from 'open' to 'stalled'

@p6rt
Copy link
Author

p6rt commented Jun 26, 2008

From @cjfields

On Mon Jun 23 09​:14​:58 2008, cjfields wrote​:

On Sat Jun 21 16​:33​:22 2008, cjfields wrote​:

On Sat Jun 21 16​:14​:31 2008, cjfields wrote​:

Attached patch passes all tests (except5 tests for tr///, which
isn't
implemented).

Marking as stalled until I can get a Str.pir patch for handling one-
to-many and many-to-one
versions of trans(), where key/value is Array. It will require some
backtracking for at least the
many-to-one case to get longest leftmost match.

New patch added, passes all tests in S05-transliteration/trans.t. I will add new spec tests for
options :s (squash) and :d (delete). I'll try to add in :c in a separate patch (will require some
more work and discussion on #perl6/#parrot).

@p6rt
Copy link
Author

p6rt commented Jun 26, 2008

From @cjfields

trans.diff
*** Str.pir.old	2008-06-26 16:27:16.000000000 -0500
--- Str.pir	2008-06-26 16:27:05.000000000 -0500
***************
*** 216,221 ****
--- 216,377 ----
      .return(retv)
  .end
  
+ =item trans()
+ 
+   Implementation of transliteration.
+   
+   TODO: Does not handle :c (complement) yet
+ 
+ =cut
+ 
+ .sub '!transtable' :multi(_)
+     .param pmc range
+     .local pmc retval, it
+     retval = new 'ResizableStringArray'
+   range_loop:
+     unless range, done
+     $S0 = range.'shift'()
+     push retval, $S0
+     goto range_loop
+   done:
+     .return(retval)
+ .end
+ 
+ .sub '!transtable' :multi('Perl6Str')
+     .param pmc str
+     .local pmc retval, prior, next, rclass, range, list, frm, to
+     .local int pos
+     retval = new 'ResizableStringArray'
+     prior = new 'ResizableStringArray'    
+     prior = str.'split'('..')
+     pos = 0 
+   process_prior:
+     unless prior, done
+     # set is a String, not a Perl6Str
+     $S0 = shift prior
+     next = new 'ResizableStringArray'
+     next = split '', $S0
+     if pos == 0 goto process_retval
+     # set up Perl6 Range here; when .HLL mapping works for Str may refactor
+   get_from:
+     $S1 = pop retval
+     if $S1 == ' ' goto get_from
+     frm = new 'Perl6Str'
+     frm = $S1
+   get_to:
+     $S1 = shift next
+     if $S1 == ' ' goto get_to
+     to = new 'Perl6Str'
+     to = $S1
+   set_range:
+     # convert to p6 Range
+     rclass = get_hll_global 'Range'
+     range = rclass.'new'('from'=>frm, 'to'=>to)
+   process_list:
+     unless range, process_retval
+     $S2 = range.'shift'()
+     push retval, $S2
+     goto process_list
+   process_retval:
+     unless next, process_prior
+     $S3 = shift next
+     push retval, $S3
+     inc pos
+     goto process_retval
+   done:
+     .return(retval)
+ .end
+ 
+ .sub 'trans' :method
+     .param pmc args :slurpy
+     .param int del :named('d') :optional
+     .param int squash :named('s') :optional
+     .local pmc pair
+     .local pmc table
+     .local pmc itable
+     .local pmc pkey
+     .local pmc pval
+     .local pmc retv
+     .local string tmps, k, v, lv, lastmatch
+     .local int len, klen, vlen, adjpos, pos, ind, nhits
+     table = new 'Hash'
+     itable = new 'Hash'
+     retv = new 'Perl6Str'
+     tmps = self
+     lv = ''
+   pair_loop:
+     unless args, st_trans
+     pair = shift args
+     pkey = pair.'key'()
+     pval = pair.'value'()
+     pkey = '!transtable'(pkey)
+     pval = '!transtable'(pval)
+     vlen = elements pval
+     if vlen goto mapping
+     if del goto mapping
+     pval = clone pkey
+   mapping:
+     unless pkey, pair_loop
+     k = shift pkey
+     unless pval, get_prev1
+     lv = shift pval
+   get_prev1:
+     unless del, get_prev2
+     v = ''
+   get_prev2:
+     v = lv
+     nhits = 0
+     pos = 0
+   index_loop:
+     ind = index tmps, k, pos
+     if ind == -1 goto check_elems
+     inc nhits
+     $S0 = itable[ind]
+     unless $S0, new_hit
+     $I0 = length $S0
+     $I1 = length k
+     if $I1 < $I0 goto next_hit
+   new_hit:
+     itable[ind] = k
+   next_hit:
+     pos = ind + 1
+     goto index_loop
+   check_elems:
+     unless nhits, mapping
+     table[k] = v
+     goto mapping
+   st_trans:
+     len = length tmps
+     pos = 0
+     adjpos = 0
+     lastmatch = ''
+   table_loop:
+     if pos == len goto done
+     k = itable[pos]
+     unless k, next_pos2
+     v = table[k]
+     unless squash, replace
+     if k != lastmatch goto replace
+   del_v:
+     v = ''
+   replace:
+     vlen = length v
+     klen = length k
+     say v
+     substr tmps, adjpos, klen, v
+   next_pos1:
+     lastmatch = k  
+     adjpos += vlen
+     pos += klen
+     goto table_loop
+   next_pos2:
+     inc pos
+     inc adjpos
+     goto table_loop
+   done:
+     retv = tmps
+     .return(retv)
+ .end
  
  =item perl()
  

@p6rt
Copy link
Author

p6rt commented Jun 26, 2008

The RT System itself - Status changed from 'stalled' to 'open'

@p6rt
Copy link
Author

p6rt commented Jun 26, 2008

From @cjfields

My bad; I left an extraneous 'say' debugging line in the last patch; attaching new one. Surprised
RT doesn't allow you to replace an old patch...

@p6rt
Copy link
Author

p6rt commented Jun 26, 2008

From @cjfields

trans.diff
*** Str.pir.old	2008-06-26 16:27:16.000000000 -0500
--- Str.pir	2008-06-26 16:41:45.000000000 -0500
***************
*** 216,221 ****
--- 216,376 ----
      .return(retv)
  .end
  
+ =item trans()
+ 
+   Implementation of transliteration.
+   
+   TODO: Does not handle :c (complement) yet
+ 
+ =cut
+ 
+ .sub '!transtable' :multi(_)
+     .param pmc range
+     .local pmc retval, it
+     retval = new 'ResizableStringArray'
+   range_loop:
+     unless range, done
+     $S0 = range.'shift'()
+     push retval, $S0
+     goto range_loop
+   done:
+     .return(retval)
+ .end
+ 
+ .sub '!transtable' :multi('Perl6Str')
+     .param pmc str
+     .local pmc retval, prior, next, rclass, range, list, frm, to
+     .local int pos
+     retval = new 'ResizableStringArray'
+     prior = new 'ResizableStringArray'    
+     prior = str.'split'('..')
+     pos = 0 
+   process_prior:
+     unless prior, done
+     # set is a String, not a Perl6Str
+     $S0 = shift prior
+     next = new 'ResizableStringArray'
+     next = split '', $S0
+     if pos == 0 goto process_retval
+     # set up Perl6 Range here; when .HLL mapping works for Str may refactor
+   get_from:
+     $S1 = pop retval
+     if $S1 == ' ' goto get_from
+     frm = new 'Perl6Str'
+     frm = $S1
+   get_to:
+     $S1 = shift next
+     if $S1 == ' ' goto get_to
+     to = new 'Perl6Str'
+     to = $S1
+   set_range:
+     # convert to p6 Range
+     rclass = get_hll_global 'Range'
+     range = rclass.'new'('from'=>frm, 'to'=>to)
+   process_list:
+     unless range, process_retval
+     $S2 = range.'shift'()
+     push retval, $S2
+     goto process_list
+   process_retval:
+     unless next, process_prior
+     $S3 = shift next
+     push retval, $S3
+     inc pos
+     goto process_retval
+   done:
+     .return(retval)
+ .end
+ 
+ .sub 'trans' :method
+     .param pmc args :slurpy
+     .param int del :named('d') :optional
+     .param int squash :named('s') :optional
+     .local pmc pair
+     .local pmc table
+     .local pmc itable
+     .local pmc pkey
+     .local pmc pval
+     .local pmc retv
+     .local string tmps, k, v, lv, lastmatch
+     .local int len, klen, vlen, adjpos, pos, ind, nhits
+     table = new 'Hash'
+     itable = new 'Hash'
+     retv = new 'Perl6Str'
+     tmps = self
+     lv = ''
+   pair_loop:
+     unless args, st_trans
+     pair = shift args
+     pkey = pair.'key'()
+     pval = pair.'value'()
+     pkey = '!transtable'(pkey)
+     pval = '!transtable'(pval)
+     vlen = elements pval
+     if vlen goto mapping
+     if del goto mapping
+     pval = clone pkey
+   mapping:
+     unless pkey, pair_loop
+     k = shift pkey
+     unless pval, get_prev1
+     lv = shift pval
+   get_prev1:
+     unless del, get_prev2
+     v = ''
+   get_prev2:
+     v = lv
+     nhits = 0
+     pos = 0
+   index_loop:
+     ind = index tmps, k, pos
+     if ind == -1 goto check_elems
+     inc nhits
+     $S0 = itable[ind]
+     unless $S0, new_hit
+     $I0 = length $S0
+     $I1 = length k
+     if $I1 < $I0 goto next_hit
+   new_hit:
+     itable[ind] = k
+   next_hit:
+     pos = ind + 1
+     goto index_loop
+   check_elems:
+     unless nhits, mapping
+     table[k] = v
+     goto mapping
+   st_trans:
+     len = length tmps
+     pos = 0
+     adjpos = 0
+     lastmatch = ''
+   table_loop:
+     if pos == len goto done
+     k = itable[pos]
+     unless k, next_pos2
+     v = table[k]
+     unless squash, replace
+     if k != lastmatch goto replace
+   del_v:
+     v = ''
+   replace:
+     vlen = length v
+     klen = length k
+     substr tmps, adjpos, klen, v
+   next_pos1:
+     lastmatch = k  
+     adjpos += vlen
+     pos += klen
+     goto table_loop
+   next_pos2:
+     inc pos
+     inc adjpos
+     goto table_loop
+   done:
+     retv = tmps
+     .return(retv)
+ .end
  
  =item perl()
  

@p6rt
Copy link
Author

p6rt commented Jul 6, 2008

From @cjfields

This should be the final (fully implemented) version of trans(). Complement, squash, and delete
options are implemented (named args :c, :s, :d); it uses longest leftmost match when needed. Also added relevant tests to t/spec/S05-transliteration/trans.t.

Not sure whether this belongs in Str or Any, so I placed in in Str.pir for now. It's possible some
things here will need to be changed as the the spec changes, but this seems to match what is
described in S05 and is a start.

@p6rt
Copy link
Author

p6rt commented Jul 6, 2008

From @cjfields

trans.diff
*** Str.pir.old	2008-07-05 20:25:35.000000000 -0500
--- Str.pir	2008-07-05 20:41:08.000000000 -0500
***************
*** 206,211 ****
--- 206,432 ----
  .end
  
  
+ =item trans()
+ 
+   Method implementation of transliteration
+   
+ =cut
+ 
+ .sub '!transtable' :multi(_)
+     .param pmc r
+     .local pmc retval
+     retval = new 'ResizableStringArray'
+   range_loop:
+     unless r, done
+     $S0 = r.'shift'()
+     push retval, $S0
+     goto range_loop
+   done:
+     .return(retval)
+ .end
+ 
+ .sub '!transtable' :multi('Perl6Str')
+     .param string str
+     .local pmc retval, prior, frm, to
+     .local int start, end, len, ind, skipped, r_start, r_end, s_len
+     .local string p
+     retval = new 'ResizableStringArray'
+     prior = new 'ResizableStringArray'
+     start = 0
+     skipped = 0
+     len = length str
+     end = len - 2
+   next_index:
+     ind = index str, '..' , start
+     if ind == -1 goto last_string
+     # ranges can only be after first position, before last one
+     if ind == 0 goto skip_pos
+     if ind >= end goto last_string    
+   init_range:
+     r_start = ind - 1
+     r_end = ind + 2
+   range_frm:
+     $S0 = substr str, r_start, 1
+     $I0 = ord $S0
+     # following code may be commented dep. on how we interpret
+     # spaces within ranges, like 'a .. b'
+     unless $S0 == ' ' goto range_to
+     r_start -= 1
+     if r_start < 0 goto illegal_range  
+     goto range_frm
+   range_to:
+     $S1 = substr str, r_end, 1
+     $I1 = ord $S1
+     # following code may be commented dep. on how we interpret
+     # spaces within ranges, like 'a .. b'
+     unless $S1 == ' ' goto prev_string
+     r_end += 1
+     if r_end == len goto illegal_range
+     goto range_to
+   prev_string:
+     s_len = r_start - start
+     s_len += skipped
+     unless s_len, start_range
+     p = substr str, start, s_len
+     prior = split '', p
+   process_pstring:
+     unless prior, start_range
+     $S2 = shift prior
+     push retval, $S2
+     goto process_pstring
+   start_range:
+     if $I0 > $I1 goto illegal_range
+   make_range:
+     # Here we're assuming the ordinal increments correctly for all chars.
+     # This is a bit naive for now, it definitely needs some unicode testing.
+     # If needed we can switch this over to use a true string Range
+     if $I0 > $I1 goto next_loop
+     $S2 = chr $I0
+     push retval, $S2
+     inc $I0
+     goto make_range
+   illegal_range:
+     die "Illegal range used in transliteration operator"
+   next_loop:
+     start = r_end + 1
+     goto next_index
+   skip_pos: 
+     inc start
+     inc skipped
+     goto next_index
+   last_string:
+     s_len = len - start
+     if s_len <= 0 goto done
+     p = substr str, start, s_len
+     prior = split '', p
+   process_lstring:
+     unless prior, done
+     $S0 = shift prior
+     push retval, $S0
+     goto process_lstring    
+   done:
+     .return(retval)
+ .end
+ 
+ .sub 'trans' :method
+     .param pmc args :slurpy
+     .param int del :named('d') :optional
+     .param int comp :named('c') :optional
+     .param int squash :named('s') :optional
+     .local pmc pair
+     .local pmc table
+     .local pmc itable
+     .local pmc pkey
+     .local pmc pval
+     .local pmc retv
+     .local string tmps, k, v, lv, lastmatch, comp_match
+     .local int len, klen, vlen, adjpos, pos, ind, nhits, isa_pair
+     table = new 'Hash'
+     itable = new 'Hash' # should used ordered hash here
+     retv = new 'Perl6Str'
+     tmps = self
+     lv = ''
+     
+   pair_loop:
+     unless args, st_trans
+     pair = shift args
+     isa_pair = isa pair
+     unless isa_pair, pair_exception
+     pkey = pair.'key'()
+     pval = pair.'value'()
+     pkey = '!transtable'(pkey)
+     pval = '!transtable'(pval)
+     vlen = elements pval
+     if vlen goto comp_check
+     if del goto comp_check
+     pval = clone pkey
+   comp_check:
+     # for :c, I am using first element for replacing for now.  I can't find
+     # many examples where this is used otherwise
+     comp_match = pval[0]
+     
+   mapping:
+     unless pkey, pair_loop
+     k = shift pkey
+     unless pval, get_prev1
+     lv = shift pval
+   get_prev1:
+     unless del, get_prev2
+     v = ''
+   get_prev2:
+     v = lv
+     nhits = 0
+     pos = 0
+   index_loop:
+     ind = index tmps, k, pos
+     if ind == -1 goto check_elems
+     inc nhits
+     $S0 = itable[ind]
+     unless $S0, new_hit
+     # keep longest hit at that index
+     $I0 = length $S0
+     $I1 = length k
+     if $I1 < $I0 goto next_hit
+   new_hit:
+     itable[ind] = k
+   next_hit:
+     pos = ind + 1
+     goto index_loop
+   check_elems:
+     unless nhits, mapping
+     table[k] = v
+     goto mapping
+     
+   st_trans:
+     len = length tmps
+     pos = 0
+     adjpos = 0
+     v = ''
+     k = ''
+     lastmatch = ''
+     
+   table_loop:
+     if pos >= len goto done
+     k = itable[pos]
+     if comp goto complement
+   normal:
+     unless k, skip_pos
+     v = table[k]
+     klen = length k
+     goto check_squash
+   complement:
+     # may need to change dep. on how we want :c to work
+     if k, skip_pos
+     v = comp_match
+     klen = 1
+   check_squash:
+     unless squash, replace
+     unless v == lastmatch goto replace
+     vlen = 0
+     substr tmps, adjpos, klen, ''
+     goto next_pos
+   replace:
+     vlen = length v
+     substr tmps, adjpos, klen, v
+   next_pos:
+     pos += klen
+     adjpos += vlen
+     lastmatch = v
+     goto table_loop
+   skip_pos:
+     inc pos
+     inc adjpos
+     lastmatch = ''
+     goto table_loop
+     
+   done:
+     retv = tmps
+     .return(retv)
+     
+   pair_exception:
+     die "Must pass a List of Pairs for transliteration"
+ .end
+ 
  =item perl()
  
  Returns a Perl representation of the Str.

@p6rt
Copy link
Author

p6rt commented Jul 6, 2008

From @cjfields

Had a type check which failed in the last patch (bad op). This is the corrected patch.

@p6rt
Copy link
Author

p6rt commented Jul 6, 2008

From @cjfields

trans.diff
*** Str.pir.old	2008-07-05 20:25:35.000000000 -0500
--- Str.pir	2008-07-05 23:13:12.000000000 -0500
***************
*** 205,210 ****
--- 205,436 ----
      .return(retv)
  .end
  
+ =item match()
+ 
+   Method implementation of match (m//)
+ 
+ =cut
+ 
+ =item trans()
+ 
+   Method implementation of transliteration
+   
+ =cut
+ 
+ .sub '!transtable' :multi(_)
+     .param pmc r
+     .local pmc retval
+     retval = new 'ResizableStringArray'
+   range_loop:
+     unless r, done
+     $S0 = r.'shift'()
+     push retval, $S0
+     goto range_loop
+   done:
+     .return(retval)
+ .end
+ 
+ .sub '!transtable' :multi('Perl6Str')
+     .param string str
+     .local pmc retval, prior, frm, to
+     .local int start, end, len, ind, skipped, r_start, r_end, s_len
+     .local string p
+     retval = new 'ResizableStringArray'
+     prior = new 'ResizableStringArray'
+     start = 0
+     skipped = 0
+     len = length str
+     end = len - 2
+   next_index:
+     ind = index str, '..' , start
+     if ind == -1 goto last_string
+     # ranges can only be after first position, before last one
+     if ind == 0 goto skip_pos
+     if ind >= end goto last_string    
+   init_range:
+     r_start = ind - 1
+     r_end = ind + 2
+   range_frm:
+     $S0 = substr str, r_start, 1
+     $I0 = ord $S0
+     # following code may be commented dep. on how we interpret
+     # spaces within ranges, like 'a .. b'
+     unless $S0 == ' ' goto range_to
+     r_start -= 1
+     if r_start < 0 goto illegal_range  
+     goto range_frm
+   range_to:
+     $S1 = substr str, r_end, 1
+     $I1 = ord $S1
+     # following code may be commented dep. on how we interpret
+     # spaces within ranges, like 'a .. b'
+     unless $S1 == ' ' goto prev_string
+     r_end += 1
+     if r_end == len goto illegal_range
+     goto range_to
+   prev_string:
+     s_len = r_start - start
+     s_len += skipped
+     unless s_len, start_range
+     p = substr str, start, s_len
+     prior = split '', p
+   process_pstring:
+     unless prior, start_range
+     $S2 = shift prior
+     push retval, $S2
+     goto process_pstring
+   start_range:
+     if $I0 > $I1 goto illegal_range
+   make_range:
+     # Here we're assuming the ordinal increments correctly for all chars.
+     # This is a bit naive for now, it definitely needs some unicode testing.
+     # If needed we can switch this over to use a true string Range
+     if $I0 > $I1 goto next_loop
+     $S2 = chr $I0
+     push retval, $S2
+     inc $I0
+     goto make_range
+   illegal_range:
+     die "Illegal range used in transliteration operator"
+   next_loop:
+     start = r_end + 1
+     goto next_index
+   skip_pos: 
+     inc start
+     inc skipped
+     goto next_index
+   last_string:
+     s_len = len - start
+     if s_len <= 0 goto done
+     p = substr str, start, s_len
+     prior = split '', p
+   process_lstring:
+     unless prior, done
+     $S0 = shift prior
+     push retval, $S0
+     goto process_lstring    
+   done:
+     .return(retval)
+ .end
+ 
+ .sub 'trans' :method
+     .param pmc args :slurpy
+     .param int del :named('d') :optional
+     .param int comp :named('c') :optional
+     .param int squash :named('s') :optional
+     .local pmc pair
+     .local pmc table
+     .local pmc itable
+     .local pmc pkey
+     .local pmc pval
+     .local pmc retv
+     .local string tmps, k, v, lv, lastmatch, comp_match
+     .local int len, klen, vlen, adjpos, pos, ind, nhits, isa_pair
+     table = new 'Hash'
+     itable = new 'Hash' # should used ordered hash here
+     retv = new 'Perl6Str'
+     tmps = self
+     lv = ''
+     
+   pair_loop:
+     unless args, st_trans
+     pair = shift args
+     isa_pair = isa pair, 'Perl6Pair'
+     unless isa_pair, pair_exception
+     pkey = pair.'key'()
+     pval = pair.'value'()
+     pkey = '!transtable'(pkey)
+     pval = '!transtable'(pval)
+     vlen = elements pval
+     if vlen goto comp_check
+     if del goto comp_check
+     pval = clone pkey
+   comp_check:
+     # for :c, I am using first element for replacing for now.  I can't find
+     # many examples where this is used otherwise
+     comp_match = pval[0]
+     
+   mapping:
+     unless pkey, pair_loop
+     k = shift pkey
+     unless pval, get_prev1
+     lv = shift pval
+   get_prev1:
+     unless del, get_prev2
+     v = ''
+   get_prev2:
+     v = lv
+     nhits = 0
+     pos = 0
+   index_loop:
+     ind = index tmps, k, pos
+     if ind == -1 goto check_elems
+     inc nhits
+     $S0 = itable[ind]
+     unless $S0, new_hit
+     # keep longest hit at that index
+     $I0 = length $S0
+     $I1 = length k
+     if $I1 < $I0 goto next_hit
+   new_hit:
+     itable[ind] = k
+   next_hit:
+     pos = ind + 1
+     goto index_loop
+   check_elems:
+     unless nhits, mapping
+     table[k] = v
+     goto mapping
+     
+   st_trans:
+     len = length tmps
+     pos = 0
+     adjpos = 0
+     v = ''
+     k = ''
+     lastmatch = ''
+     
+   table_loop:
+     if pos >= len goto done
+     k = itable[pos]
+     if comp goto complement
+   normal:
+     unless k, skip_pos
+     v = table[k]
+     klen = length k
+     goto check_squash
+   complement:
+     # may need to change dep. on how we want :c to work
+     if k, skip_pos
+     v = comp_match
+     klen = 1
+   check_squash:
+     unless squash, replace
+     unless v == lastmatch goto replace
+     vlen = 0
+     substr tmps, adjpos, klen, ''
+     goto next_pos
+   replace:
+     vlen = length v
+     substr tmps, adjpos, klen, v
+   next_pos:
+     pos += klen
+     adjpos += vlen
+     lastmatch = v
+     goto table_loop
+   skip_pos:
+     inc pos
+     inc adjpos
+     lastmatch = ''
+     goto table_loop
+     
+   done:
+     retv = tmps
+     .return(retv)
+     
+   pair_exception:
+     die "Must pass a List of Pairs for transliteration"
+ .end
  
  =item perl()
  

@p6rt
Copy link
Author

p6rt commented Jul 6, 2008

From @cjfields

Code moved to builtins/any-str.pir. New patch attached.

@p6rt
Copy link
Author

p6rt commented Jul 6, 2008

From @cjfields

trans.diff
*** any-str.pir.old	2008-07-06 15:42:32.000000000 -0500
--- any-str.pir	2008-07-06 15:38:10.000000000 -0500
***************
*** 75,85 ****
      .return ($P0)
  .end
  
  
! =back
! 
  =cut
  
  # Local Variables:
  #   mode: pir
  #   fill-column: 100
--- 75,301 ----
      .return ($P0)
  .end
  
+ =item trans()
  
!   Implementation of transliteration
!   
  =cut
  
+ .sub '!transtable' :multi(_)
+     .param pmc r
+     .local pmc retval
+     retval = new 'ResizableStringArray'
+   range_loop:
+     unless r, done
+     $S0 = r.'shift'()
+     push retval, $S0
+     goto range_loop
+   done:
+     .return(retval)
+ .end
+ 
+ .sub '!transtable' :multi('Perl6Str')
+     .param string str
+     .local pmc retval, prior, frm, to
+     .local int start, end, len, ind, skipped, r_start, r_end, s_len
+     .local string p
+     retval = new 'ResizableStringArray'
+     prior = new 'ResizableStringArray'
+     start = 0
+     skipped = 0
+     len = length str
+     end = len - 2
+   next_index:
+     ind = index str, '..' , start
+     if ind == -1 goto last_string
+     # ranges can only be after first position, before last one
+     if ind == 0 goto skip_pos
+     if ind >= end goto last_string    
+   init_range:
+     r_start = ind - 1
+     r_end = ind + 2
+   range_frm:
+     $S0 = substr str, r_start, 1
+     $I0 = ord $S0
+     # following code may be commented dep. on how we interpret
+     # spaces within ranges, like 'a .. b'
+     unless $S0 == ' ' goto range_to
+     r_start -= 1
+     if r_start < 0 goto illegal_range  
+     goto range_frm
+   range_to:
+     $S1 = substr str, r_end, 1
+     $I1 = ord $S1
+     # following code may be commented dep. on how we interpret
+     # spaces within ranges, like 'a .. b'
+     unless $S1 == ' ' goto prev_string
+     r_end += 1
+     if r_end == len goto illegal_range
+     goto range_to
+   prev_string:
+     s_len = r_start - start
+     s_len += skipped
+     unless s_len, start_range
+     p = substr str, start, s_len
+     prior = split '', p
+   process_pstring:
+     unless prior, start_range
+     $S2 = shift prior
+     push retval, $S2
+     goto process_pstring
+   start_range:
+     if $I0 > $I1 goto illegal_range
+   make_range:
+     # Here we're assuming the ordinal increments correctly for all chars.
+     # This is a bit naive for now, it definitely needs some unicode testing.
+     # If needed we can switch this over to use a true string Range
+     if $I0 > $I1 goto next_loop
+     $S2 = chr $I0
+     push retval, $S2
+     inc $I0
+     goto make_range
+   illegal_range:
+     die "Illegal range used in transliteration operator"
+   next_loop:
+     start = r_end + 1
+     goto next_index
+   skip_pos: 
+     inc start
+     inc skipped
+     goto next_index
+   last_string:
+     s_len = len - start
+     if s_len <= 0 goto done
+     p = substr str, start, s_len
+     prior = split '', p
+   process_lstring:
+     unless prior, done
+     $S0 = shift prior
+     push retval, $S0
+     goto process_lstring    
+   done:
+     .return(retval)
+ .end
+ 
+ .sub 'trans' :method
+     .param pmc args :slurpy
+     .param int del :named('d') :optional
+     .param int comp :named('c') :optional
+     .param int squash :named('s') :optional
+     .local pmc pair
+     .local pmc table
+     .local pmc itable
+     .local pmc pkey
+     .local pmc pval
+     .local pmc retv
+     .local string tmps, k, v, lv, lastmatch, comp_match
+     .local int len, klen, vlen, adjpos, pos, ind, nhits, isa_pair
+     table = new 'Hash'
+     itable = new 'Hash' # should used ordered hash here
+     retv = new 'Perl6Str'
+     tmps = self
+     lv = ''
+     
+   pair_loop:
+     unless args, st_trans
+     pair = shift args
+     isa_pair = isa pair, 'Perl6Pair'
+     unless isa_pair, pair_exception
+     pkey = pair.'key'()
+     pval = pair.'value'()
+     pkey = '!transtable'(pkey)
+     pval = '!transtable'(pval)
+     vlen = elements pval
+     if vlen goto comp_check
+     if del goto comp_check
+     pval = clone pkey
+   comp_check:
+     # for :c, I am using first element for replacing for now.  I can't find
+     # many examples where this is used otherwise
+     comp_match = pval[0]
+     
+   mapping:
+     unless pkey, pair_loop
+     k = shift pkey
+     unless pval, get_prev1
+     lv = shift pval
+   get_prev1:
+     unless del, get_prev2
+     v = ''
+   get_prev2:
+     v = lv
+     nhits = 0
+     pos = 0
+   index_loop:
+     ind = index tmps, k, pos
+     if ind == -1 goto check_elems
+     inc nhits
+     $S0 = itable[ind]
+     unless $S0, new_hit
+     # keep longest hit at that index
+     $I0 = length $S0
+     $I1 = length k
+     if $I1 < $I0 goto next_hit
+   new_hit:
+     itable[ind] = k
+   next_hit:
+     pos = ind + 1
+     goto index_loop
+   check_elems:
+     unless nhits, mapping
+     table[k] = v
+     goto mapping
+     
+   st_trans:
+     len = length tmps
+     pos = 0
+     adjpos = 0
+     v = ''
+     k = ''
+     lastmatch = ''
+     
+   table_loop:
+     if pos >= len goto done
+     k = itable[pos]
+     if comp goto complement
+   normal:
+     unless k, skip_pos
+     v = table[k]
+     klen = length k
+     goto check_squash
+   complement:
+     # may need to change dep. on how we want :c to work
+     if k, skip_pos
+     v = comp_match
+     klen = 1
+   check_squash:
+     unless squash, replace
+     unless v == lastmatch goto replace
+     vlen = 0
+     substr tmps, adjpos, klen, ''
+     goto next_pos
+   replace:
+     vlen = length v
+     substr tmps, adjpos, klen, v
+   next_pos:
+     pos += klen
+     adjpos += vlen
+     lastmatch = v
+     goto table_loop
+   skip_pos:
+     inc pos
+     inc adjpos
+     lastmatch = ''
+     goto table_loop
+     
+   done:
+     retv = tmps
+     .return(retv)
+     
+   pair_exception:
+     die "Must pass a List of Pairs for transliteration"
+ .end
+ 
  # Local Variables:
  #   mode: pir
  #   fill-column: 100

@p6rt
Copy link
Author

p6rt commented Jul 8, 2008

From @cjfields

On Sun Jul 06 14​:18​:57 2008, cjfields wrote​:

Code moved to builtins/any-str.pir. New patch attached.

Latest patch works with :c, :s using a many-to-many transliteration using arrays, as described
in S05. Patch conforms to submissions.pod (sorry about all of the additional ones above).

@p6rt
Copy link
Author

p6rt commented Jul 8, 2008

From @cjfields

trans.diff
--- languages/perl6/src/builtins/any-str.pir.old	2008-07-06 15:42:32.000000000 -0500
+++ languages/perl6/src/builtins/any-str.pir	2008-07-07 21:28:03.000000000 -0500
@@ -75,11 +75,232 @@
     .return ($P0)
 .end
 
+=item trans()
 
-=back
-
+  Implementation of transliteration
+  
 =cut
 
+.sub '!transtable' :multi(_)
+    .param pmc r
+    .local pmc retval
+    retval = new 'ResizableStringArray'
+  range_loop:
+    unless r, done
+    $S0 = r.'shift'()
+    push retval, $S0
+    goto range_loop
+  done:
+    .return(retval)
+.end
+
+.sub '!transtable' :multi('Perl6Str')
+    .param string str
+    .local pmc retval, prior, frm, to
+    .local int start, end, len, ind, skipped, r_start, r_end, s_len
+    .local string p
+    retval = new 'ResizableStringArray'
+    prior = new 'ResizableStringArray'
+    start = 0
+    skipped = 0
+    len = length str
+    end = len - 2
+  next_index:
+    ind = index str, '..' , start
+    if ind == -1 goto last_string
+    # ranges can only be after first position, before last one
+    if ind == 0 goto skip_pos
+    if ind >= end goto last_string    
+  init_range:
+    r_start = ind - 1
+    r_end = ind + 2
+  range_frm:
+    $S0 = substr str, r_start, 1
+    $I0 = ord $S0
+    # following code may be commented dep. on how we interpret
+    # spaces within ranges, like 'a .. b'
+    unless $S0 == ' ' goto range_to
+    r_start -= 1
+    if r_start < 0 goto illegal_range  
+    goto range_frm
+  range_to:
+    $S1 = substr str, r_end, 1
+    $I1 = ord $S1
+    # following code may be commented dep. on how we interpret
+    # spaces within ranges, like 'a .. b'
+    unless $S1 == ' ' goto prev_string
+    r_end += 1
+    if r_end == len goto illegal_range
+    goto range_to
+  prev_string:
+    s_len = r_start - start
+    s_len += skipped
+    unless s_len, start_range
+    p = substr str, start, s_len
+    prior = split '', p
+  process_pstring:
+    unless prior, start_range
+    $S2 = shift prior
+    push retval, $S2
+    goto process_pstring
+  start_range:
+    if $I0 > $I1 goto illegal_range
+  make_range:
+    # Here we're assuming the ordinal increments correctly for all chars.
+    # This is a bit naive for now, it definitely needs some unicode testing.
+    # If needed we can switch this over to use a true string Range
+    if $I0 > $I1 goto next_loop
+    $S2 = chr $I0
+    push retval, $S2
+    inc $I0
+    goto make_range
+  illegal_range:
+    die "Illegal range used in transliteration operator"
+  next_loop:
+    start = r_end + 1
+    goto next_index
+  skip_pos: 
+    inc start
+    inc skipped
+    goto next_index
+  last_string:
+    s_len = len - start
+    if s_len <= 0 goto done
+    p = substr str, start, s_len
+    prior = split '', p
+  process_lstring:
+    unless prior, done
+    $S0 = shift prior
+    push retval, $S0
+    goto process_lstring    
+  done:
+    .return(retval)
+.end
+
+.sub 'trans' :method
+    .param pmc args :slurpy
+    .param int del :named('d') :optional
+    .param int comp :named('c') :optional
+    .param int squash :named('s') :optional
+    .local pmc pair
+    .local pmc table
+    .local pmc itable
+    .local pmc pkey
+    .local pmc pval
+    .local pmc retv
+    .local string tmps, k, v, lv, lastmatch, comp_match
+    .local int len, klen, vlen, adjpos, pos, ind, nhits, isa_pair
+    table = new 'Hash'
+    itable = new 'Hash'
+    retv = new 'Perl6Str'
+    tmps = self
+    lv = ''
+    
+  pair_loop:
+    unless args, st_trans
+    pair = shift args
+    isa_pair = isa pair, 'Perl6Pair'
+    unless isa_pair, pair_exception
+    pkey = pair.'key'()
+    pval = pair.'value'()
+    pkey = '!transtable'(pkey)
+    pval = '!transtable'(pval)
+    vlen = elements pval
+    if vlen goto comp_check
+    if del goto comp_check
+    pval = clone pkey
+  comp_check:
+    # for :c, I am using first element for replacing for now.  I can't find
+    # many examples where this is used otherwise
+    comp_match = pval[0]
+    
+  mapping:
+    unless pkey, pair_loop
+    k = shift pkey
+    unless pval, get_prev1
+    lv = shift pval
+  get_prev1:
+    unless del, get_prev2
+    v = ''
+  get_prev2:
+    v = lv
+    nhits = 0
+    pos = 0
+  index_loop:
+    ind = index tmps, k, pos
+    if ind == -1 goto check_elems
+    inc nhits
+    $S0 = itable[ind]
+    unless $S0, new_hit
+    # keep longest hit at that index
+    $I0 = length $S0
+    $I1 = length k
+    if $I1 < $I0 goto next_hit
+  new_hit:
+    itable[ind] = k
+  next_hit:
+    pos = ind + 1
+    goto index_loop
+  check_elems:
+    unless nhits, mapping
+    table[k] = v
+    goto mapping
+    
+  st_trans:
+    len = length tmps
+    pos = 0
+    adjpos = 0
+    v = ''
+    k = ''
+    lastmatch = ''
+    
+  table_loop:
+    if pos >= len goto done
+    k = itable[pos]
+    klen = length k
+    if comp goto complement
+  normal:
+    unless k, skip_pos
+    v = table[k]
+    goto check_squash
+  complement:
+    # may need to change dep. on how we want :c to work
+    if k, skip_pos_comp
+    v = comp_match
+    klen = 1
+  check_squash:
+    unless squash, replace
+    unless v == lastmatch goto replace
+    vlen = 0
+    substr tmps, adjpos, klen, ''
+    goto next_pos
+  replace:
+    vlen = length v
+    substr tmps, adjpos, klen, v
+  next_pos:
+    pos      += klen
+    adjpos   += vlen
+    lastmatch = v
+    goto table_loop
+  skip_pos:
+    pos      += 1
+    adjpos   += 1
+    lastmatch = ''
+    goto table_loop
+  skip_pos_comp:
+    pos      += klen
+    adjpos   += klen
+    lastmatch = ''
+    goto table_loop    
+    
+  done:
+    retv = tmps
+    .return(retv)
+    
+  pair_exception:
+    die "Must pass a List of Pairs for transliteration"
+.end
+
 # Local Variables:
 #   mode: pir
 #   fill-column: 100

@p6rt
Copy link
Author

p6rt commented Jul 15, 2008

From @moritz

Applied as r29490 (with minor whitespaces fixes and adding the
appropriate test to spectest_regression), thank you very much.

@p6rt
Copy link
Author

p6rt commented Jul 15, 2008

@moritz - Status changed from 'open' to 'resolved'

@p6rt p6rt closed this as completed Jul 15, 2008
@p6rt p6rt added the patch label Jan 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant