New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
implementation of transliteration #125
Comments
From @cjfieldsAttached is a partial PIR-based implementation of trans() for Str.pir A more fully implemented p6-based version is being developed and will chris |
From @cjfields |
From @cjfieldsAttaching updated diff file (against r28209) which passes 5/17 tests. Will commit rakudo- |
From @cjfieldstrans.diff*** Str.pir 2008-06-08 23:28:40.000000000 -0500
--- Str.pir.back 2008-06-08 13:37:39.000000000 -0500
***************
*** 213,283 ****
.return(retv)
.end
- =item trans()
-
- Partial implementation of transliteration. Handles simple p6-like
- transliteration (see S05):
-
- "ABC".trans( ('A'=>'a'), ('B'=>'b'), ('C'=>'c') ) # abc
- "ABC".trans('XYZ' => 'xyz')
- "ABCXYZ".trans( (['A'..'C'] => ['a'..'c']), (<X Y Z> => <x y z>)
-
- Does not interpolate strings 'A..Z' or 'A-Z' into Ranges ('A'..'Z') yet
-
- =cut
-
- .sub 'trans' :method
- .param pmc args :slurpy
- .local pmc pair
- .local pmc table
- .local pmc tmps
- .local pmc retv
-
- table = new 'Hash'
- retv = new 'Perl6Str'
- tmps = self
-
- pair_loop:
- unless args, st_trans
- pair = shift args
- .local pmc key, val, key_it, val_it
- # should handle Array key/value pairs
- key = pair.'key'()
- val = pair.'value'()
- key_it = new 'Iterator', key
- val_it = new 'Iterator', val
-
- mapping:
- unless key_it, pair_loop
- $S0 = shift key_it
- $S1 = shift val_it
- table[$S0] = $S1
- goto mapping
-
- st_trans:
- # 'Perl6Str' iterator is not working here for some reason (not implemented)
- .local int pos, len
- .local string orig, rep
- len = tmps.'chars'()
- pos = 0
-
- trans_loop:
- if pos == len goto done
- orig = substr tmps, pos, 1
- rep = table[orig]
- if rep, next
- rep = orig
-
- next:
- concat retv, rep
- pos += 1
- goto trans_loop
-
- done:
- .return(retv)
- .end
-
- =cut
=item perl()
--- 213,218 ----
|
@cjfields - Status changed from 'new' to 'open' |
From @cjfieldsS05-transliteration/trans.t has been updated in pugs, should pass 5 of 17 tests now with last |
From @cjfieldsAttached patch passes all tests (except5 tests for tr///, which isn't implemented). |
From @cjfieldstrans.diff*** Str.pir.old 2008-06-21 16:59:14.000000000 -0500
--- Str.pir 2008-06-21 17:55:45.000000000 -0500
***************
*** 213,218 ****
--- 213,339 ----
.return(retv)
.end
+ =item trans()
+
+ Partial implementation of transliteration. Handles simple p6-like
+ transliteration (see S05):
+
+ "ABC".trans( ('A'=>'a'), ('B'=>'b'), ('C'=>'c') ) # abc
+ "ABC".trans('ABC' => 'abc')
+ "ABCXYZ".trans( (['A'..'C'] => ['a'..'c']), (<X Y Z> => <x y z>) )
+
+ Does not interpret strings 'A..Z' or 'A-Z' into Ranges ('A'..'Z') yet
+
+ =cut
+
+ .sub '!transmap' :multi(_)
+ .param pmc range
+ .local pmc retval, it
+ retval = new 'ResizablePMCArray'
+ range_loop:
+ unless range, done
+ $S0 = range.'shift'()
+ push retval, $S0
+ goto range_loop
+ done:
+ .return(retval)
+ .end
+
+ .sub '!transmap' :multi('Perl6Str')
+ .param pmc str
+ .local pmc retval, prior, next, rclass, range, list, frm, to
+ .local int end, pos
+ retval = new 'ResizablePMCArray'
+ prior = new 'ResizablePMCArray'
+ prior = str.'split'('..')
+ end = prior
+ dec end
+ pos = 0
+ process_prior:
+ unless prior, done
+ # set is a String, not a Perl6Str
+ $S0 = shift prior
+ next = new 'ResizablePMCArray'
+ next = split '', $S0
+ if pos == 0 goto process_retval
+ # set up Perl6 Range here; when .HLL mapping works for Str may refactor
+ get_from:
+ $S1 = pop retval
+ if $S1 == ' ' goto get_from
+ frm = new 'Perl6Str'
+ frm = $S1
+ get_to:
+ $S1 = shift next
+ if $S1 == ' ' goto get_to
+ to = new 'Perl6Str'
+ to = $S1
+ set_range:
+ rclass = get_hll_global 'Range'
+ range = rclass.'new'('from'=>frm, 'to'=>to)
+ process_list:
+ unless range, process_retval
+ $S2 = range.'shift'()
+ push retval, $S2
+ goto process_list
+ process_retval:
+ unless next, process_prior
+ $S3 = shift next
+ push retval, $S3
+ inc pos
+ goto process_retval
+ done:
+ .return(retval)
+ .end
+
+ .sub 'trans' :method
+ .param pmc args :slurpy
+ .local pmc pair
+ .local pmc table
+ .local pmc tmps
+ .local pmc retv
+ table = new 'Hash'
+ retv = new 'Perl6Str'
+ tmps = self
+ pair_loop:
+ unless args, st_trans
+ pair = shift args
+ .local pmc key, val, keymap, valmap
+ .local string lv
+ key = pair.'key'()
+ val = pair.'value'()
+ keymap = '!transmap'(key)
+ valmap = '!transmap'(val)
+ mapping:
+ unless keymap, pair_loop
+ # reconfigure to let val 'overflow'
+ $S0 = shift keymap
+ unless valmap, get_prev
+ lv = shift valmap
+ get_prev:
+ $S1 = lv
+ table[$S0] = $S1
+ goto mapping
+ st_trans:
+ # 'Perl6Str' iterator is not working here for some reason (not implemented)
+ .local int pos, len
+ .local string orig, rep
+ len = tmps.'chars'()
+ pos = 0
+ trans_loop:
+ if pos == len goto done
+ orig = substr tmps, pos, 1
+ rep = table[orig]
+ if rep, next
+ rep = orig
+ next:
+ concat retv, rep
+ pos += 1
+ goto trans_loop
+ done:
+ .return(retv)
+ .end
+
+ =cut
=item perl()
|
From @cjfieldsOn Sat Jun 21 16:14:31 2008, cjfields wrote:
D'oh! POD needs to be updated to reflect strings are interpreted to look for ranges. |
From @cjfieldsOn Sat Jun 21 16:33:22 2008, cjfields wrote:
Marking as stalled until I can get a Str.pir patch for handling one-to-many and many-to-one |
@cjfields - Status changed from 'open' to 'stalled' |
From @cjfieldsOn Mon Jun 23 09:14:58 2008, cjfields wrote:
New patch added, passes all tests in S05-transliteration/trans.t. I will add new spec tests for |
From @cjfieldstrans.diff*** Str.pir.old 2008-06-26 16:27:16.000000000 -0500
--- Str.pir 2008-06-26 16:27:05.000000000 -0500
***************
*** 216,221 ****
--- 216,377 ----
.return(retv)
.end
+ =item trans()
+
+ Implementation of transliteration.
+
+ TODO: Does not handle :c (complement) yet
+
+ =cut
+
+ .sub '!transtable' :multi(_)
+ .param pmc range
+ .local pmc retval, it
+ retval = new 'ResizableStringArray'
+ range_loop:
+ unless range, done
+ $S0 = range.'shift'()
+ push retval, $S0
+ goto range_loop
+ done:
+ .return(retval)
+ .end
+
+ .sub '!transtable' :multi('Perl6Str')
+ .param pmc str
+ .local pmc retval, prior, next, rclass, range, list, frm, to
+ .local int pos
+ retval = new 'ResizableStringArray'
+ prior = new 'ResizableStringArray'
+ prior = str.'split'('..')
+ pos = 0
+ process_prior:
+ unless prior, done
+ # set is a String, not a Perl6Str
+ $S0 = shift prior
+ next = new 'ResizableStringArray'
+ next = split '', $S0
+ if pos == 0 goto process_retval
+ # set up Perl6 Range here; when .HLL mapping works for Str may refactor
+ get_from:
+ $S1 = pop retval
+ if $S1 == ' ' goto get_from
+ frm = new 'Perl6Str'
+ frm = $S1
+ get_to:
+ $S1 = shift next
+ if $S1 == ' ' goto get_to
+ to = new 'Perl6Str'
+ to = $S1
+ set_range:
+ # convert to p6 Range
+ rclass = get_hll_global 'Range'
+ range = rclass.'new'('from'=>frm, 'to'=>to)
+ process_list:
+ unless range, process_retval
+ $S2 = range.'shift'()
+ push retval, $S2
+ goto process_list
+ process_retval:
+ unless next, process_prior
+ $S3 = shift next
+ push retval, $S3
+ inc pos
+ goto process_retval
+ done:
+ .return(retval)
+ .end
+
+ .sub 'trans' :method
+ .param pmc args :slurpy
+ .param int del :named('d') :optional
+ .param int squash :named('s') :optional
+ .local pmc pair
+ .local pmc table
+ .local pmc itable
+ .local pmc pkey
+ .local pmc pval
+ .local pmc retv
+ .local string tmps, k, v, lv, lastmatch
+ .local int len, klen, vlen, adjpos, pos, ind, nhits
+ table = new 'Hash'
+ itable = new 'Hash'
+ retv = new 'Perl6Str'
+ tmps = self
+ lv = ''
+ pair_loop:
+ unless args, st_trans
+ pair = shift args
+ pkey = pair.'key'()
+ pval = pair.'value'()
+ pkey = '!transtable'(pkey)
+ pval = '!transtable'(pval)
+ vlen = elements pval
+ if vlen goto mapping
+ if del goto mapping
+ pval = clone pkey
+ mapping:
+ unless pkey, pair_loop
+ k = shift pkey
+ unless pval, get_prev1
+ lv = shift pval
+ get_prev1:
+ unless del, get_prev2
+ v = ''
+ get_prev2:
+ v = lv
+ nhits = 0
+ pos = 0
+ index_loop:
+ ind = index tmps, k, pos
+ if ind == -1 goto check_elems
+ inc nhits
+ $S0 = itable[ind]
+ unless $S0, new_hit
+ $I0 = length $S0
+ $I1 = length k
+ if $I1 < $I0 goto next_hit
+ new_hit:
+ itable[ind] = k
+ next_hit:
+ pos = ind + 1
+ goto index_loop
+ check_elems:
+ unless nhits, mapping
+ table[k] = v
+ goto mapping
+ st_trans:
+ len = length tmps
+ pos = 0
+ adjpos = 0
+ lastmatch = ''
+ table_loop:
+ if pos == len goto done
+ k = itable[pos]
+ unless k, next_pos2
+ v = table[k]
+ unless squash, replace
+ if k != lastmatch goto replace
+ del_v:
+ v = ''
+ replace:
+ vlen = length v
+ klen = length k
+ say v
+ substr tmps, adjpos, klen, v
+ next_pos1:
+ lastmatch = k
+ adjpos += vlen
+ pos += klen
+ goto table_loop
+ next_pos2:
+ inc pos
+ inc adjpos
+ goto table_loop
+ done:
+ retv = tmps
+ .return(retv)
+ .end
=item perl()
|
The RT System itself - Status changed from 'stalled' to 'open' |
From @cjfieldsMy bad; I left an extraneous 'say' debugging line in the last patch; attaching new one. Surprised |
From @cjfieldstrans.diff*** Str.pir.old 2008-06-26 16:27:16.000000000 -0500
--- Str.pir 2008-06-26 16:41:45.000000000 -0500
***************
*** 216,221 ****
--- 216,376 ----
.return(retv)
.end
+ =item trans()
+
+ Implementation of transliteration.
+
+ TODO: Does not handle :c (complement) yet
+
+ =cut
+
+ .sub '!transtable' :multi(_)
+ .param pmc range
+ .local pmc retval, it
+ retval = new 'ResizableStringArray'
+ range_loop:
+ unless range, done
+ $S0 = range.'shift'()
+ push retval, $S0
+ goto range_loop
+ done:
+ .return(retval)
+ .end
+
+ .sub '!transtable' :multi('Perl6Str')
+ .param pmc str
+ .local pmc retval, prior, next, rclass, range, list, frm, to
+ .local int pos
+ retval = new 'ResizableStringArray'
+ prior = new 'ResizableStringArray'
+ prior = str.'split'('..')
+ pos = 0
+ process_prior:
+ unless prior, done
+ # set is a String, not a Perl6Str
+ $S0 = shift prior
+ next = new 'ResizableStringArray'
+ next = split '', $S0
+ if pos == 0 goto process_retval
+ # set up Perl6 Range here; when .HLL mapping works for Str may refactor
+ get_from:
+ $S1 = pop retval
+ if $S1 == ' ' goto get_from
+ frm = new 'Perl6Str'
+ frm = $S1
+ get_to:
+ $S1 = shift next
+ if $S1 == ' ' goto get_to
+ to = new 'Perl6Str'
+ to = $S1
+ set_range:
+ # convert to p6 Range
+ rclass = get_hll_global 'Range'
+ range = rclass.'new'('from'=>frm, 'to'=>to)
+ process_list:
+ unless range, process_retval
+ $S2 = range.'shift'()
+ push retval, $S2
+ goto process_list
+ process_retval:
+ unless next, process_prior
+ $S3 = shift next
+ push retval, $S3
+ inc pos
+ goto process_retval
+ done:
+ .return(retval)
+ .end
+
+ .sub 'trans' :method
+ .param pmc args :slurpy
+ .param int del :named('d') :optional
+ .param int squash :named('s') :optional
+ .local pmc pair
+ .local pmc table
+ .local pmc itable
+ .local pmc pkey
+ .local pmc pval
+ .local pmc retv
+ .local string tmps, k, v, lv, lastmatch
+ .local int len, klen, vlen, adjpos, pos, ind, nhits
+ table = new 'Hash'
+ itable = new 'Hash'
+ retv = new 'Perl6Str'
+ tmps = self
+ lv = ''
+ pair_loop:
+ unless args, st_trans
+ pair = shift args
+ pkey = pair.'key'()
+ pval = pair.'value'()
+ pkey = '!transtable'(pkey)
+ pval = '!transtable'(pval)
+ vlen = elements pval
+ if vlen goto mapping
+ if del goto mapping
+ pval = clone pkey
+ mapping:
+ unless pkey, pair_loop
+ k = shift pkey
+ unless pval, get_prev1
+ lv = shift pval
+ get_prev1:
+ unless del, get_prev2
+ v = ''
+ get_prev2:
+ v = lv
+ nhits = 0
+ pos = 0
+ index_loop:
+ ind = index tmps, k, pos
+ if ind == -1 goto check_elems
+ inc nhits
+ $S0 = itable[ind]
+ unless $S0, new_hit
+ $I0 = length $S0
+ $I1 = length k
+ if $I1 < $I0 goto next_hit
+ new_hit:
+ itable[ind] = k
+ next_hit:
+ pos = ind + 1
+ goto index_loop
+ check_elems:
+ unless nhits, mapping
+ table[k] = v
+ goto mapping
+ st_trans:
+ len = length tmps
+ pos = 0
+ adjpos = 0
+ lastmatch = ''
+ table_loop:
+ if pos == len goto done
+ k = itable[pos]
+ unless k, next_pos2
+ v = table[k]
+ unless squash, replace
+ if k != lastmatch goto replace
+ del_v:
+ v = ''
+ replace:
+ vlen = length v
+ klen = length k
+ substr tmps, adjpos, klen, v
+ next_pos1:
+ lastmatch = k
+ adjpos += vlen
+ pos += klen
+ goto table_loop
+ next_pos2:
+ inc pos
+ inc adjpos
+ goto table_loop
+ done:
+ retv = tmps
+ .return(retv)
+ .end
=item perl()
|
From @cjfieldsThis should be the final (fully implemented) version of trans(). Complement, squash, and delete Not sure whether this belongs in Str or Any, so I placed in in Str.pir for now. It's possible some |
From @cjfieldstrans.diff*** Str.pir.old 2008-07-05 20:25:35.000000000 -0500
--- Str.pir 2008-07-05 20:41:08.000000000 -0500
***************
*** 206,211 ****
--- 206,432 ----
.end
+ =item trans()
+
+ Method implementation of transliteration
+
+ =cut
+
+ .sub '!transtable' :multi(_)
+ .param pmc r
+ .local pmc retval
+ retval = new 'ResizableStringArray'
+ range_loop:
+ unless r, done
+ $S0 = r.'shift'()
+ push retval, $S0
+ goto range_loop
+ done:
+ .return(retval)
+ .end
+
+ .sub '!transtable' :multi('Perl6Str')
+ .param string str
+ .local pmc retval, prior, frm, to
+ .local int start, end, len, ind, skipped, r_start, r_end, s_len
+ .local string p
+ retval = new 'ResizableStringArray'
+ prior = new 'ResizableStringArray'
+ start = 0
+ skipped = 0
+ len = length str
+ end = len - 2
+ next_index:
+ ind = index str, '..' , start
+ if ind == -1 goto last_string
+ # ranges can only be after first position, before last one
+ if ind == 0 goto skip_pos
+ if ind >= end goto last_string
+ init_range:
+ r_start = ind - 1
+ r_end = ind + 2
+ range_frm:
+ $S0 = substr str, r_start, 1
+ $I0 = ord $S0
+ # following code may be commented dep. on how we interpret
+ # spaces within ranges, like 'a .. b'
+ unless $S0 == ' ' goto range_to
+ r_start -= 1
+ if r_start < 0 goto illegal_range
+ goto range_frm
+ range_to:
+ $S1 = substr str, r_end, 1
+ $I1 = ord $S1
+ # following code may be commented dep. on how we interpret
+ # spaces within ranges, like 'a .. b'
+ unless $S1 == ' ' goto prev_string
+ r_end += 1
+ if r_end == len goto illegal_range
+ goto range_to
+ prev_string:
+ s_len = r_start - start
+ s_len += skipped
+ unless s_len, start_range
+ p = substr str, start, s_len
+ prior = split '', p
+ process_pstring:
+ unless prior, start_range
+ $S2 = shift prior
+ push retval, $S2
+ goto process_pstring
+ start_range:
+ if $I0 > $I1 goto illegal_range
+ make_range:
+ # Here we're assuming the ordinal increments correctly for all chars.
+ # This is a bit naive for now, it definitely needs some unicode testing.
+ # If needed we can switch this over to use a true string Range
+ if $I0 > $I1 goto next_loop
+ $S2 = chr $I0
+ push retval, $S2
+ inc $I0
+ goto make_range
+ illegal_range:
+ die "Illegal range used in transliteration operator"
+ next_loop:
+ start = r_end + 1
+ goto next_index
+ skip_pos:
+ inc start
+ inc skipped
+ goto next_index
+ last_string:
+ s_len = len - start
+ if s_len <= 0 goto done
+ p = substr str, start, s_len
+ prior = split '', p
+ process_lstring:
+ unless prior, done
+ $S0 = shift prior
+ push retval, $S0
+ goto process_lstring
+ done:
+ .return(retval)
+ .end
+
+ .sub 'trans' :method
+ .param pmc args :slurpy
+ .param int del :named('d') :optional
+ .param int comp :named('c') :optional
+ .param int squash :named('s') :optional
+ .local pmc pair
+ .local pmc table
+ .local pmc itable
+ .local pmc pkey
+ .local pmc pval
+ .local pmc retv
+ .local string tmps, k, v, lv, lastmatch, comp_match
+ .local int len, klen, vlen, adjpos, pos, ind, nhits, isa_pair
+ table = new 'Hash'
+ itable = new 'Hash' # should used ordered hash here
+ retv = new 'Perl6Str'
+ tmps = self
+ lv = ''
+
+ pair_loop:
+ unless args, st_trans
+ pair = shift args
+ isa_pair = isa pair
+ unless isa_pair, pair_exception
+ pkey = pair.'key'()
+ pval = pair.'value'()
+ pkey = '!transtable'(pkey)
+ pval = '!transtable'(pval)
+ vlen = elements pval
+ if vlen goto comp_check
+ if del goto comp_check
+ pval = clone pkey
+ comp_check:
+ # for :c, I am using first element for replacing for now. I can't find
+ # many examples where this is used otherwise
+ comp_match = pval[0]
+
+ mapping:
+ unless pkey, pair_loop
+ k = shift pkey
+ unless pval, get_prev1
+ lv = shift pval
+ get_prev1:
+ unless del, get_prev2
+ v = ''
+ get_prev2:
+ v = lv
+ nhits = 0
+ pos = 0
+ index_loop:
+ ind = index tmps, k, pos
+ if ind == -1 goto check_elems
+ inc nhits
+ $S0 = itable[ind]
+ unless $S0, new_hit
+ # keep longest hit at that index
+ $I0 = length $S0
+ $I1 = length k
+ if $I1 < $I0 goto next_hit
+ new_hit:
+ itable[ind] = k
+ next_hit:
+ pos = ind + 1
+ goto index_loop
+ check_elems:
+ unless nhits, mapping
+ table[k] = v
+ goto mapping
+
+ st_trans:
+ len = length tmps
+ pos = 0
+ adjpos = 0
+ v = ''
+ k = ''
+ lastmatch = ''
+
+ table_loop:
+ if pos >= len goto done
+ k = itable[pos]
+ if comp goto complement
+ normal:
+ unless k, skip_pos
+ v = table[k]
+ klen = length k
+ goto check_squash
+ complement:
+ # may need to change dep. on how we want :c to work
+ if k, skip_pos
+ v = comp_match
+ klen = 1
+ check_squash:
+ unless squash, replace
+ unless v == lastmatch goto replace
+ vlen = 0
+ substr tmps, adjpos, klen, ''
+ goto next_pos
+ replace:
+ vlen = length v
+ substr tmps, adjpos, klen, v
+ next_pos:
+ pos += klen
+ adjpos += vlen
+ lastmatch = v
+ goto table_loop
+ skip_pos:
+ inc pos
+ inc adjpos
+ lastmatch = ''
+ goto table_loop
+
+ done:
+ retv = tmps
+ .return(retv)
+
+ pair_exception:
+ die "Must pass a List of Pairs for transliteration"
+ .end
+
=item perl()
Returns a Perl representation of the Str.
|
From @cjfieldsHad a type check which failed in the last patch (bad op). This is the corrected patch. |
From @cjfieldstrans.diff*** Str.pir.old 2008-07-05 20:25:35.000000000 -0500
--- Str.pir 2008-07-05 23:13:12.000000000 -0500
***************
*** 205,210 ****
--- 205,436 ----
.return(retv)
.end
+ =item match()
+
+ Method implementation of match (m//)
+
+ =cut
+
+ =item trans()
+
+ Method implementation of transliteration
+
+ =cut
+
+ .sub '!transtable' :multi(_)
+ .param pmc r
+ .local pmc retval
+ retval = new 'ResizableStringArray'
+ range_loop:
+ unless r, done
+ $S0 = r.'shift'()
+ push retval, $S0
+ goto range_loop
+ done:
+ .return(retval)
+ .end
+
+ .sub '!transtable' :multi('Perl6Str')
+ .param string str
+ .local pmc retval, prior, frm, to
+ .local int start, end, len, ind, skipped, r_start, r_end, s_len
+ .local string p
+ retval = new 'ResizableStringArray'
+ prior = new 'ResizableStringArray'
+ start = 0
+ skipped = 0
+ len = length str
+ end = len - 2
+ next_index:
+ ind = index str, '..' , start
+ if ind == -1 goto last_string
+ # ranges can only be after first position, before last one
+ if ind == 0 goto skip_pos
+ if ind >= end goto last_string
+ init_range:
+ r_start = ind - 1
+ r_end = ind + 2
+ range_frm:
+ $S0 = substr str, r_start, 1
+ $I0 = ord $S0
+ # following code may be commented dep. on how we interpret
+ # spaces within ranges, like 'a .. b'
+ unless $S0 == ' ' goto range_to
+ r_start -= 1
+ if r_start < 0 goto illegal_range
+ goto range_frm
+ range_to:
+ $S1 = substr str, r_end, 1
+ $I1 = ord $S1
+ # following code may be commented dep. on how we interpret
+ # spaces within ranges, like 'a .. b'
+ unless $S1 == ' ' goto prev_string
+ r_end += 1
+ if r_end == len goto illegal_range
+ goto range_to
+ prev_string:
+ s_len = r_start - start
+ s_len += skipped
+ unless s_len, start_range
+ p = substr str, start, s_len
+ prior = split '', p
+ process_pstring:
+ unless prior, start_range
+ $S2 = shift prior
+ push retval, $S2
+ goto process_pstring
+ start_range:
+ if $I0 > $I1 goto illegal_range
+ make_range:
+ # Here we're assuming the ordinal increments correctly for all chars.
+ # This is a bit naive for now, it definitely needs some unicode testing.
+ # If needed we can switch this over to use a true string Range
+ if $I0 > $I1 goto next_loop
+ $S2 = chr $I0
+ push retval, $S2
+ inc $I0
+ goto make_range
+ illegal_range:
+ die "Illegal range used in transliteration operator"
+ next_loop:
+ start = r_end + 1
+ goto next_index
+ skip_pos:
+ inc start
+ inc skipped
+ goto next_index
+ last_string:
+ s_len = len - start
+ if s_len <= 0 goto done
+ p = substr str, start, s_len
+ prior = split '', p
+ process_lstring:
+ unless prior, done
+ $S0 = shift prior
+ push retval, $S0
+ goto process_lstring
+ done:
+ .return(retval)
+ .end
+
+ .sub 'trans' :method
+ .param pmc args :slurpy
+ .param int del :named('d') :optional
+ .param int comp :named('c') :optional
+ .param int squash :named('s') :optional
+ .local pmc pair
+ .local pmc table
+ .local pmc itable
+ .local pmc pkey
+ .local pmc pval
+ .local pmc retv
+ .local string tmps, k, v, lv, lastmatch, comp_match
+ .local int len, klen, vlen, adjpos, pos, ind, nhits, isa_pair
+ table = new 'Hash'
+ itable = new 'Hash' # should used ordered hash here
+ retv = new 'Perl6Str'
+ tmps = self
+ lv = ''
+
+ pair_loop:
+ unless args, st_trans
+ pair = shift args
+ isa_pair = isa pair, 'Perl6Pair'
+ unless isa_pair, pair_exception
+ pkey = pair.'key'()
+ pval = pair.'value'()
+ pkey = '!transtable'(pkey)
+ pval = '!transtable'(pval)
+ vlen = elements pval
+ if vlen goto comp_check
+ if del goto comp_check
+ pval = clone pkey
+ comp_check:
+ # for :c, I am using first element for replacing for now. I can't find
+ # many examples where this is used otherwise
+ comp_match = pval[0]
+
+ mapping:
+ unless pkey, pair_loop
+ k = shift pkey
+ unless pval, get_prev1
+ lv = shift pval
+ get_prev1:
+ unless del, get_prev2
+ v = ''
+ get_prev2:
+ v = lv
+ nhits = 0
+ pos = 0
+ index_loop:
+ ind = index tmps, k, pos
+ if ind == -1 goto check_elems
+ inc nhits
+ $S0 = itable[ind]
+ unless $S0, new_hit
+ # keep longest hit at that index
+ $I0 = length $S0
+ $I1 = length k
+ if $I1 < $I0 goto next_hit
+ new_hit:
+ itable[ind] = k
+ next_hit:
+ pos = ind + 1
+ goto index_loop
+ check_elems:
+ unless nhits, mapping
+ table[k] = v
+ goto mapping
+
+ st_trans:
+ len = length tmps
+ pos = 0
+ adjpos = 0
+ v = ''
+ k = ''
+ lastmatch = ''
+
+ table_loop:
+ if pos >= len goto done
+ k = itable[pos]
+ if comp goto complement
+ normal:
+ unless k, skip_pos
+ v = table[k]
+ klen = length k
+ goto check_squash
+ complement:
+ # may need to change dep. on how we want :c to work
+ if k, skip_pos
+ v = comp_match
+ klen = 1
+ check_squash:
+ unless squash, replace
+ unless v == lastmatch goto replace
+ vlen = 0
+ substr tmps, adjpos, klen, ''
+ goto next_pos
+ replace:
+ vlen = length v
+ substr tmps, adjpos, klen, v
+ next_pos:
+ pos += klen
+ adjpos += vlen
+ lastmatch = v
+ goto table_loop
+ skip_pos:
+ inc pos
+ inc adjpos
+ lastmatch = ''
+ goto table_loop
+
+ done:
+ retv = tmps
+ .return(retv)
+
+ pair_exception:
+ die "Must pass a List of Pairs for transliteration"
+ .end
=item perl()
|
From @cjfieldsCode moved to builtins/any-str.pir. New patch attached. |
From @cjfieldstrans.diff*** any-str.pir.old 2008-07-06 15:42:32.000000000 -0500
--- any-str.pir 2008-07-06 15:38:10.000000000 -0500
***************
*** 75,85 ****
.return ($P0)
.end
! =back
!
=cut
# Local Variables:
# mode: pir
# fill-column: 100
--- 75,301 ----
.return ($P0)
.end
+ =item trans()
! Implementation of transliteration
!
=cut
+ .sub '!transtable' :multi(_)
+ .param pmc r
+ .local pmc retval
+ retval = new 'ResizableStringArray'
+ range_loop:
+ unless r, done
+ $S0 = r.'shift'()
+ push retval, $S0
+ goto range_loop
+ done:
+ .return(retval)
+ .end
+
+ .sub '!transtable' :multi('Perl6Str')
+ .param string str
+ .local pmc retval, prior, frm, to
+ .local int start, end, len, ind, skipped, r_start, r_end, s_len
+ .local string p
+ retval = new 'ResizableStringArray'
+ prior = new 'ResizableStringArray'
+ start = 0
+ skipped = 0
+ len = length str
+ end = len - 2
+ next_index:
+ ind = index str, '..' , start
+ if ind == -1 goto last_string
+ # ranges can only be after first position, before last one
+ if ind == 0 goto skip_pos
+ if ind >= end goto last_string
+ init_range:
+ r_start = ind - 1
+ r_end = ind + 2
+ range_frm:
+ $S0 = substr str, r_start, 1
+ $I0 = ord $S0
+ # following code may be commented dep. on how we interpret
+ # spaces within ranges, like 'a .. b'
+ unless $S0 == ' ' goto range_to
+ r_start -= 1
+ if r_start < 0 goto illegal_range
+ goto range_frm
+ range_to:
+ $S1 = substr str, r_end, 1
+ $I1 = ord $S1
+ # following code may be commented dep. on how we interpret
+ # spaces within ranges, like 'a .. b'
+ unless $S1 == ' ' goto prev_string
+ r_end += 1
+ if r_end == len goto illegal_range
+ goto range_to
+ prev_string:
+ s_len = r_start - start
+ s_len += skipped
+ unless s_len, start_range
+ p = substr str, start, s_len
+ prior = split '', p
+ process_pstring:
+ unless prior, start_range
+ $S2 = shift prior
+ push retval, $S2
+ goto process_pstring
+ start_range:
+ if $I0 > $I1 goto illegal_range
+ make_range:
+ # Here we're assuming the ordinal increments correctly for all chars.
+ # This is a bit naive for now, it definitely needs some unicode testing.
+ # If needed we can switch this over to use a true string Range
+ if $I0 > $I1 goto next_loop
+ $S2 = chr $I0
+ push retval, $S2
+ inc $I0
+ goto make_range
+ illegal_range:
+ die "Illegal range used in transliteration operator"
+ next_loop:
+ start = r_end + 1
+ goto next_index
+ skip_pos:
+ inc start
+ inc skipped
+ goto next_index
+ last_string:
+ s_len = len - start
+ if s_len <= 0 goto done
+ p = substr str, start, s_len
+ prior = split '', p
+ process_lstring:
+ unless prior, done
+ $S0 = shift prior
+ push retval, $S0
+ goto process_lstring
+ done:
+ .return(retval)
+ .end
+
+ .sub 'trans' :method
+ .param pmc args :slurpy
+ .param int del :named('d') :optional
+ .param int comp :named('c') :optional
+ .param int squash :named('s') :optional
+ .local pmc pair
+ .local pmc table
+ .local pmc itable
+ .local pmc pkey
+ .local pmc pval
+ .local pmc retv
+ .local string tmps, k, v, lv, lastmatch, comp_match
+ .local int len, klen, vlen, adjpos, pos, ind, nhits, isa_pair
+ table = new 'Hash'
+ itable = new 'Hash' # should used ordered hash here
+ retv = new 'Perl6Str'
+ tmps = self
+ lv = ''
+
+ pair_loop:
+ unless args, st_trans
+ pair = shift args
+ isa_pair = isa pair, 'Perl6Pair'
+ unless isa_pair, pair_exception
+ pkey = pair.'key'()
+ pval = pair.'value'()
+ pkey = '!transtable'(pkey)
+ pval = '!transtable'(pval)
+ vlen = elements pval
+ if vlen goto comp_check
+ if del goto comp_check
+ pval = clone pkey
+ comp_check:
+ # for :c, I am using first element for replacing for now. I can't find
+ # many examples where this is used otherwise
+ comp_match = pval[0]
+
+ mapping:
+ unless pkey, pair_loop
+ k = shift pkey
+ unless pval, get_prev1
+ lv = shift pval
+ get_prev1:
+ unless del, get_prev2
+ v = ''
+ get_prev2:
+ v = lv
+ nhits = 0
+ pos = 0
+ index_loop:
+ ind = index tmps, k, pos
+ if ind == -1 goto check_elems
+ inc nhits
+ $S0 = itable[ind]
+ unless $S0, new_hit
+ # keep longest hit at that index
+ $I0 = length $S0
+ $I1 = length k
+ if $I1 < $I0 goto next_hit
+ new_hit:
+ itable[ind] = k
+ next_hit:
+ pos = ind + 1
+ goto index_loop
+ check_elems:
+ unless nhits, mapping
+ table[k] = v
+ goto mapping
+
+ st_trans:
+ len = length tmps
+ pos = 0
+ adjpos = 0
+ v = ''
+ k = ''
+ lastmatch = ''
+
+ table_loop:
+ if pos >= len goto done
+ k = itable[pos]
+ if comp goto complement
+ normal:
+ unless k, skip_pos
+ v = table[k]
+ klen = length k
+ goto check_squash
+ complement:
+ # may need to change dep. on how we want :c to work
+ if k, skip_pos
+ v = comp_match
+ klen = 1
+ check_squash:
+ unless squash, replace
+ unless v == lastmatch goto replace
+ vlen = 0
+ substr tmps, adjpos, klen, ''
+ goto next_pos
+ replace:
+ vlen = length v
+ substr tmps, adjpos, klen, v
+ next_pos:
+ pos += klen
+ adjpos += vlen
+ lastmatch = v
+ goto table_loop
+ skip_pos:
+ inc pos
+ inc adjpos
+ lastmatch = ''
+ goto table_loop
+
+ done:
+ retv = tmps
+ .return(retv)
+
+ pair_exception:
+ die "Must pass a List of Pairs for transliteration"
+ .end
+
# Local Variables:
# mode: pir
# fill-column: 100
|
From @cjfieldsOn Sun Jul 06 14:18:57 2008, cjfields wrote:
Latest patch works with :c, :s using a many-to-many transliteration using arrays, as described |
From @cjfieldstrans.diff--- languages/perl6/src/builtins/any-str.pir.old 2008-07-06 15:42:32.000000000 -0500
+++ languages/perl6/src/builtins/any-str.pir 2008-07-07 21:28:03.000000000 -0500
@@ -75,11 +75,232 @@
.return ($P0)
.end
+=item trans()
-=back
-
+ Implementation of transliteration
+
=cut
+.sub '!transtable' :multi(_)
+ .param pmc r
+ .local pmc retval
+ retval = new 'ResizableStringArray'
+ range_loop:
+ unless r, done
+ $S0 = r.'shift'()
+ push retval, $S0
+ goto range_loop
+ done:
+ .return(retval)
+.end
+
+.sub '!transtable' :multi('Perl6Str')
+ .param string str
+ .local pmc retval, prior, frm, to
+ .local int start, end, len, ind, skipped, r_start, r_end, s_len
+ .local string p
+ retval = new 'ResizableStringArray'
+ prior = new 'ResizableStringArray'
+ start = 0
+ skipped = 0
+ len = length str
+ end = len - 2
+ next_index:
+ ind = index str, '..' , start
+ if ind == -1 goto last_string
+ # ranges can only be after first position, before last one
+ if ind == 0 goto skip_pos
+ if ind >= end goto last_string
+ init_range:
+ r_start = ind - 1
+ r_end = ind + 2
+ range_frm:
+ $S0 = substr str, r_start, 1
+ $I0 = ord $S0
+ # following code may be commented dep. on how we interpret
+ # spaces within ranges, like 'a .. b'
+ unless $S0 == ' ' goto range_to
+ r_start -= 1
+ if r_start < 0 goto illegal_range
+ goto range_frm
+ range_to:
+ $S1 = substr str, r_end, 1
+ $I1 = ord $S1
+ # following code may be commented dep. on how we interpret
+ # spaces within ranges, like 'a .. b'
+ unless $S1 == ' ' goto prev_string
+ r_end += 1
+ if r_end == len goto illegal_range
+ goto range_to
+ prev_string:
+ s_len = r_start - start
+ s_len += skipped
+ unless s_len, start_range
+ p = substr str, start, s_len
+ prior = split '', p
+ process_pstring:
+ unless prior, start_range
+ $S2 = shift prior
+ push retval, $S2
+ goto process_pstring
+ start_range:
+ if $I0 > $I1 goto illegal_range
+ make_range:
+ # Here we're assuming the ordinal increments correctly for all chars.
+ # This is a bit naive for now, it definitely needs some unicode testing.
+ # If needed we can switch this over to use a true string Range
+ if $I0 > $I1 goto next_loop
+ $S2 = chr $I0
+ push retval, $S2
+ inc $I0
+ goto make_range
+ illegal_range:
+ die "Illegal range used in transliteration operator"
+ next_loop:
+ start = r_end + 1
+ goto next_index
+ skip_pos:
+ inc start
+ inc skipped
+ goto next_index
+ last_string:
+ s_len = len - start
+ if s_len <= 0 goto done
+ p = substr str, start, s_len
+ prior = split '', p
+ process_lstring:
+ unless prior, done
+ $S0 = shift prior
+ push retval, $S0
+ goto process_lstring
+ done:
+ .return(retval)
+.end
+
+.sub 'trans' :method
+ .param pmc args :slurpy
+ .param int del :named('d') :optional
+ .param int comp :named('c') :optional
+ .param int squash :named('s') :optional
+ .local pmc pair
+ .local pmc table
+ .local pmc itable
+ .local pmc pkey
+ .local pmc pval
+ .local pmc retv
+ .local string tmps, k, v, lv, lastmatch, comp_match
+ .local int len, klen, vlen, adjpos, pos, ind, nhits, isa_pair
+ table = new 'Hash'
+ itable = new 'Hash'
+ retv = new 'Perl6Str'
+ tmps = self
+ lv = ''
+
+ pair_loop:
+ unless args, st_trans
+ pair = shift args
+ isa_pair = isa pair, 'Perl6Pair'
+ unless isa_pair, pair_exception
+ pkey = pair.'key'()
+ pval = pair.'value'()
+ pkey = '!transtable'(pkey)
+ pval = '!transtable'(pval)
+ vlen = elements pval
+ if vlen goto comp_check
+ if del goto comp_check
+ pval = clone pkey
+ comp_check:
+ # for :c, I am using first element for replacing for now. I can't find
+ # many examples where this is used otherwise
+ comp_match = pval[0]
+
+ mapping:
+ unless pkey, pair_loop
+ k = shift pkey
+ unless pval, get_prev1
+ lv = shift pval
+ get_prev1:
+ unless del, get_prev2
+ v = ''
+ get_prev2:
+ v = lv
+ nhits = 0
+ pos = 0
+ index_loop:
+ ind = index tmps, k, pos
+ if ind == -1 goto check_elems
+ inc nhits
+ $S0 = itable[ind]
+ unless $S0, new_hit
+ # keep longest hit at that index
+ $I0 = length $S0
+ $I1 = length k
+ if $I1 < $I0 goto next_hit
+ new_hit:
+ itable[ind] = k
+ next_hit:
+ pos = ind + 1
+ goto index_loop
+ check_elems:
+ unless nhits, mapping
+ table[k] = v
+ goto mapping
+
+ st_trans:
+ len = length tmps
+ pos = 0
+ adjpos = 0
+ v = ''
+ k = ''
+ lastmatch = ''
+
+ table_loop:
+ if pos >= len goto done
+ k = itable[pos]
+ klen = length k
+ if comp goto complement
+ normal:
+ unless k, skip_pos
+ v = table[k]
+ goto check_squash
+ complement:
+ # may need to change dep. on how we want :c to work
+ if k, skip_pos_comp
+ v = comp_match
+ klen = 1
+ check_squash:
+ unless squash, replace
+ unless v == lastmatch goto replace
+ vlen = 0
+ substr tmps, adjpos, klen, ''
+ goto next_pos
+ replace:
+ vlen = length v
+ substr tmps, adjpos, klen, v
+ next_pos:
+ pos += klen
+ adjpos += vlen
+ lastmatch = v
+ goto table_loop
+ skip_pos:
+ pos += 1
+ adjpos += 1
+ lastmatch = ''
+ goto table_loop
+ skip_pos_comp:
+ pos += klen
+ adjpos += klen
+ lastmatch = ''
+ goto table_loop
+
+ done:
+ retv = tmps
+ .return(retv)
+
+ pair_exception:
+ die "Must pass a List of Pairs for transliteration"
+.end
+
# Local Variables:
# mode: pir
# fill-column: 100
|
From @moritzApplied as r29490 (with minor whitespaces fixes and adding the |
@moritz - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#55492 (status was 'resolved')
Searchable as RT55492$
The text was updated successfully, but these errors were encountered: