Wide character in subroutine entry, DB_File #15083
From frederik@ofb.net
Created by frederik@ofb.net
The following program produces the error "Wide character in subroutine
1. perldiag mentions "Wide character in %s" but not "Wide character in
2. I guess DB_File is a bit old, but I chose it because I don't need
3. Then again, DB_File could be updated to support UTF-8.
Thanks so much for a great programming language.
    #!/bin/perl
    use strict;
    $\ = "\n";
    my $dbf = "xx.db";
    my %h;
    # tie %h, "BerkeleyDB::Btree", -Filename=>$dbf, -Flags=>DB_CREATE;
    my @ents;
    for(@ents) { $h{$_} = 1; }
    print join("\n", keys %h);
Perl Info
From @tonycoz
On Tue Dec 08 14:33:46 2015, frederik@ofb.net wrote:
BerkeleyDB simply isn't warning about the lack of UTF-8 support.
If I add the following to the end of your code:
    my @keys = keys %h;
and uncomment the BerkeleyDB tie, you'll see that the key you supplied
Luckily both BerkeleyDB and DB_File have a mechanism to automatically process
    use DBM_Filter;
for BerkeleyDB:
    my $db = tie %h, "BerkeleyDB::Btree", -Filename=>$dbf, -Flags=>DB_CREATE;
Here I'm only processing the keys; see the documentation on processing the values instead (or as well).
(perldoc DBM_Filter claims to support BerkeleyDB, but doesn't appear to.)
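For readers who want to try the filter approach described above, here is a minimal sketch using DB_File rather than BerkeleyDB. It assumes an in-memory database (an `undef` filename) and a made-up key; it is not the exact code from the reply, only an illustration of `Filter_Key_Push` with the `utf8` filter that ships with DBM_Filter.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use DB_File;
use DBM_Filter;

# Assumption: an in-memory hash database (undef filename) keeps the sketch
# self-contained; a real program would pass a file name instead.
my %h;
my $db = tie %h, 'DB_File', undef, O_RDWR | O_CREAT, 0666, $DB_HASH
    or die "Cannot tie DB_File: $!";

# Encode keys to UTF-8 octets on store and decode them again on fetch.
# Only keys are filtered here; values are left untouched.
$db->Filter_Key_Push('utf8');

# Hypothetical sample key containing a code point above 0xFF.
my $key = "\x{263A}";
$h{$key} = 1;

# The key comes back as a character string, not as raw octets.
my @keys = keys %h;
printf "%d key(s), first key has %d character(s)\n",
    scalar @keys, length $keys[0];
```

`Filter_Value_Push` and `Filter_Push` cover values only and both keys and values, respectively, if the stored values can also contain wide characters.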
That warning is caused by the XS code for DB_File calling SvPVbyte(), and it
I'm not sure explaining that would be useful to a normal user reading the documentation.
DB_File is CPAN upstream and is maintained by the same author as BerkeleyDB. CPAN upstream issues should be reported upstream; see https://rt.cpan.org/Public/Dist/Display.html?Name=DB_File
Tony
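To make the byte-string requirement concrete, the sketch below uses only core Perl and does not touch DB_File. It assumes, as a rough model of what a byte-only interface needs, that a string is acceptable when `utf8::downgrade` succeeds; the helper name `fits_in_bytes` is invented for this illustration.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: true if every character of the string fits in one
# byte. It works on a copy, so the caller's string is not modified.
sub fits_in_bytes {
    my ($string) = @_;
    # With a true second argument, utf8::downgrade returns false on
    # failure instead of dying.
    return utf8::downgrade($string, 1) ? 1 : 0;
}

my $latin1 = "\x{E9}";      # e-acute, representable as a single byte
my $wide   = "\x{263A}";    # code point above 0xFF, a "wide character"

my $latin1_ok = fits_in_bytes($latin1);
my $wide_ok   = fits_in_bytes($wide);

# Encoding to UTF-8 turns the wide character into a sequence of octets,
# which a byte-only interface can accept.
my $octets    = encode('UTF-8', $wide);
my $octets_ok = fits_in_bytes($octets);

print "latin1: $latin1_ok, wide: $wide_ok, encoded: $octets_ok\n";
```

This is why encoding before the store (and decoding after the fetch) avoids the error: the database layer only ever sees octets.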
The RT System itself - Status changed from 'new' to 'open'
From @eserte
On Tue, 08 Dec 2015, 14:33:46, frederik@ofb.net wrote:
DB_File (and the underlying Berkeley DB engine, I guess) can handle only binary (or octets or latin1) data. There's no way to specify a specific encoding, especially for "wide characters". But if you know that you have to store data in the utf8 encoding, then you can define "DBM filters" which do the translation from wide characters into octets and vice versa automatically:
    for my $filter (qw(filter_store_key filter_store_value)) {
Maybe something like this could be added to the DB_File documentation. Maybe there's also room for a tiny (CPAN) module, say DB_File::utf8, which does something like this automatically.
Regards,
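The snippet in the comment above is cut off in this copy of the thread. A minimal sketch of the same idea, hand-written DBM filters built on Encode, might look like the following. The in-memory database, the sample key and value, and the choice of `encode`/`decode` with `'UTF-8'` are assumptions made for illustration, not a reconstruction of the original code.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDWR O_CREAT);
use DB_File;
use Encode qw(encode decode);

# Assumption: in-memory database (undef filename) so nothing is written to disk.
my %h;
my $db = tie %h, 'DB_File', undef, O_RDWR | O_CREAT, 0666, $DB_HASH
    or die "Cannot tie DB_File: $!";

# Store filters: turn character strings into UTF-8 octets.
# A DBM filter receives the data in $_ and modifies it in place.
for my $filter (qw(filter_store_key filter_store_value)) {
    $db->$filter(sub { $_ = encode('UTF-8', $_) });
}

# Fetch filters: turn the stored octets back into character strings.
for my $filter (qw(filter_fetch_key filter_fetch_value)) {
    $db->$filter(sub { $_ = decode('UTF-8', $_) });
}

# Hypothetical sample data containing characters above 0xFF.
my ($key, $value) = ("\x{263A}", "caf\x{E9} \x{20AC}");
$h{$key} = $value;

my $round_trip = $h{$key};
my @keys       = keys %h;
print "value round-trips: ", ($round_trip eq $value ? "yes" : "no"), "\n";
```

Installing both the store and the fetch filters keeps the mapping symmetric, so callers only ever handle character strings.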
From @eserte
On Wed, 09 Dec 2015, 12:52:43, slaven@rezic.de wrote:
Missed Tony's answer, and of course, DBM_Filter::utf8 is there and good enough.
Regards,
From @pmqs
Hey Tony/Slaven - thanks for dealing with this for me. Only just seen it.
Just looking at the DB_File/BerkeleyDB docs, I notice that I haven't actually mentioned the DBM_Filter module at all. I should also mention UTF-8 in the DB_File/BerkeleyDB docs, as it is a reasonably common use case.
I see there are new tickets against the modules themselves for this issue, so that means I won't forget to do the update.
cheers
From frederik@ofb.net
Thank you Tony and Slaven for your replies.
I'm sending to bug-DB_File@rt.cpan.org and bug-BerkeleyDB@rt.cpan.org
The bug, to summarize what's below, is really a request for the
I don't think any changes to the code are necessary, given what's been
Thanks again!
On Tue, Dec 08, 2015 at 07:50:07PM -0800, Tony Cook via RT wrote:
On Wed, Dec 09, 2015 at 12:52:43PM -0800, slaven@rezic.de via RT wrote:
On Wed, Dec 09, 2015 at 01:13:31PM -0800, slaven@rezic.de via RT wrote:
From @tonycoz
On Thu Dec 10 15:27:33 2015, paul.marquess@ntlworld.com wrote:
The only core issue (since DB_File and BerkeleyDB are CPAN upstream) that
In addition but when I tested the code from my sample calling Filter_Key_Push
Is BerkeleyDB meant to support DBM_Filter?
Tony
From @pmqs
the *DB*_File modules distributed with Perl, the BerkeleyDB module,
Yes, it is. I'll take a look.
Paul
From @tonycoz
On Tue, 08 Dec 2015 19:50:07 -0800, tonyc wrote:
This was reported upstream as https://rt.cpan.org/Public/Bug/Display.html?id=110248 so closing.
Tony
@tonycoz - Status changed from 'open' to 'resolved'
Migrated from rt.perl.org#126849 (status was 'resolved')
Searchable as RT126849$