Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix const correctness for header files #15732

Closed
p5pRT opened this issue Nov 24, 2016 · 13 comments
Closed

Fix const correctness for header files #15732

p5pRT opened this issue Nov 24, 2016 · 13 comments

Comments

@p5pRT
Copy link

p5pRT commented Nov 24, 2016

Migrated from rt.perl.org#130169 (status was 'resolved')

Searchable as RT130169$

@p5pRT
Copy link
Author

p5pRT commented Nov 24, 2016

From @ppisar

Hello,

when building a code that includes

#include "EXTERN.h
#include "perl.h"

with GCC's -Wcast-qual option, I get many warnings about discarding const
qualifiers like this​:

$ printf '#include "EXTERN.h"\n#include "perl.h"\n' | gcc -Wcast-qual -I/usr/lib64/perl5/CORE -c -x c -
In file included from /usr/lib64/perl5/CORE/hv.h​:629​:0,
  from /usr/lib64/perl5/CORE/perl.h​:3740,
  from <stdin>​:2​:
/usr/lib64/perl5/CORE/hv_func.h​: In function ‘S_perl_hash_siphash_2_4’​:
/usr/lib64/perl5/CORE/hv_func.h​:213​:17​: warning​: cast discards ‘const’ qualifier from pointer target type [-Wcast-qual]
  U64TYPE k0 = ((U64TYPE*)seed)[0];
  ^

This makes difficult to spot issues in non-perl code and people complain
(https://bugzilla.redhat.com/show_bug.cgi?id=1242980).

Series of three patches is attached that tries to fix this issue​:

First patch fixes hash functions in hv_func.h. Because their input is already
const and their output does not rely pointers to input data, this patch seems
simple and non-controversial.

Second patch touches utf8.h and utfebcdic.h. It changes dereferencing string
bytes in is_UTF8-* check macros. While these are macros without argument types,
I think its reasonable to assume the input string does not change during the
check, so the change is desired. The only unusual thing is the changed code
was originally generated by a script, then hand-editted, and then comes my
patch. I did not try to locate the disfunctional (or deleted?) generator in
regen/regcharclass.pl and ammend it.

The third and last patch touches utf8_hop* functions. This is the most
controversal change as it changes their prototype. I cannot see a way how to
get rid of the warnings there without changing prototype. I looked into glibc
sources how char *strchr(const char *, int) is solved. It isn't. It also
produces many warnings. Here I doubt about helpfullness of the -Wcast-qual
warnings.

-- Petr

@p5pRT
Copy link
Author

p5pRT commented Nov 24, 2016

From @ppisar

0001-Fix-const-correctness-in-hv_func.h.patch
From ee8c4bf20643c07d04f6a5927096bf7b89fcd300 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Petr=20P=C3=ADsa=C5=99?= <ppisar@redhat.com>
Date: Thu, 24 Nov 2016 16:34:09 +0100
Subject: [PATCH 1/3] Fix const correctness in hv_func.h
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Building an XS code with -Wcast-qual yielded warnings about discarding
const qualifiers from pointer targets like:

$ printf '#include "EXTERN.h"\n#include "perl.h"\n' | gcc -Wcast-qual -I/usr/lib64/perl5/CORE -c -x c -
In file included from /usr/lib64/perl5/CORE/hv.h:629:0,
                 from /usr/lib64/perl5/CORE/perl.h:3740,
                 from <stdin>:2:
/usr/lib64/perl5/CORE/hv_func.h: In function ‘S_perl_hash_siphash_2_4’:
/usr/lib64/perl5/CORE/hv_func.h:213:17: warning: cast discards ‘const’ qualifier from pointer target type [-Wcast-qual]
   U64TYPE k0 = ((U64TYPE*)seed)[0];
                 ^

Signed-off-by: Petr Písař <ppisar@redhat.com>
---
 hv_func.h | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/hv_func.h b/hv_func.h
index 8866db9..57b1ed1 100644
--- a/hv_func.h
+++ b/hv_func.h
@@ -118,7 +118,7 @@
 
 #if (BYTEORDER == 0x1234 || BYTEORDER == 0x12345678) && U32SIZE == 4
   /* CPU endian matches murmurhash algorithm, so read 32-bit word directly */
-  #define U8TO32_LE(ptr)   (*((U32*)(ptr)))
+  #define U8TO32_LE(ptr)   (*((const U32*)(ptr)))
 #elif BYTEORDER == 0x4321 || BYTEORDER == 0x87654321
   /* TODO: Add additional cases below where a compiler provided bswap32 is available */
   #if defined(__GNUC__) && (__GNUC__>4 || (__GNUC__==4 && __GNUC_MINOR__>=3))
@@ -210,8 +210,8 @@ S_perl_hash_siphash_2_4(const unsigned char * const seed, const unsigned char *i
   U64 v3 = UINT64_C(0x7465646279746573);
 
   U64 b;
-  U64 k0 = ((U64*)seed)[0];
-  U64 k1 = ((U64*)seed)[1];
+  U64 k0 = ((const U64*)seed)[0];
+  U64 k1 = ((const U64*)seed)[1];
   U64 m;
   const int left = inlen & 7;
   const U8 *end = in + inlen - left;
@@ -269,7 +269,7 @@ S_perl_hash_siphash_2_4(const unsigned char * const seed, const unsigned char *i
 
 PERL_STATIC_INLINE U32
 S_perl_hash_superfast(const unsigned char * const seed, const unsigned char *str, STRLEN len) {
-    U32 hash = *((U32*)seed) + (U32)len;
+    U32 hash = *((const U32*)seed) + (U32)len;
     U32 tmp;
     int rem= len & 3;
     len >>= 2;
@@ -373,7 +373,7 @@ S_perl_hash_superfast(const unsigned char * const seed, const unsigned char *str
 /* now we create the hash function */
 PERL_STATIC_INLINE U32
 S_perl_hash_murmur3(const unsigned char * const seed, const unsigned char *ptr, STRLEN len) {
-    U32 h1 = *((U32*)seed);
+    U32 h1 = *((const U32*)seed);
     U32 k1;
     U32 carry = 0;
 
@@ -467,7 +467,7 @@ S_perl_hash_murmur3(const unsigned char * const seed, const unsigned char *ptr,
 PERL_STATIC_INLINE U32
 S_perl_hash_djb2(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
     const unsigned char * const end = (const unsigned char *)str + len;
-    U32 hash = *((U32*)seed) + (U32)len;
+    U32 hash = *((const U32*)seed) + (U32)len;
     while (str < end) {
         hash = ((hash << 5) + hash) + *str++;
     }
@@ -477,7 +477,7 @@ S_perl_hash_djb2(const unsigned char * const seed, const unsigned char *str, con
 PERL_STATIC_INLINE U32
 S_perl_hash_sdbm(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
     const unsigned char * const end = (const unsigned char *)str + len;
-    U32 hash = *((U32*)seed) + (U32)len;
+    U32 hash = *((const U32*)seed) + (U32)len;
     while (str < end) {
         hash = (hash << 6) + (hash << 16) - hash + *str++;
     }
@@ -503,7 +503,7 @@ S_perl_hash_sdbm(const unsigned char * const seed, const unsigned char *str, con
 PERL_STATIC_INLINE U32
 S_perl_hash_one_at_a_time(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
     const unsigned char * const end = (const unsigned char *)str + len;
-    U32 hash = *((U32*)seed) + (U32)len;
+    U32 hash = *((const U32*)seed) + (U32)len;
     while (str < end) {
         hash += *str++;
         hash += (hash << 10);
@@ -518,7 +518,7 @@ S_perl_hash_one_at_a_time(const unsigned char * const seed, const unsigned char
 PERL_STATIC_INLINE U32
 S_perl_hash_one_at_a_time_hard(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
     const unsigned char * const end = (const unsigned char *)str + len;
-    U32 hash = *((U32*)seed) + (U32)len;
+    U32 hash = *((const U32*)seed) + (U32)len;
     
     while (str < end) {
         hash += (hash << 10);
@@ -553,7 +553,7 @@ S_perl_hash_one_at_a_time_hard(const unsigned char * const seed, const unsigned
 PERL_STATIC_INLINE U32
 S_perl_hash_old_one_at_a_time(const unsigned char * const seed, const unsigned char *str, const STRLEN len) {
     const unsigned char * const end = (const unsigned char *)str + len;
-    U32 hash = *((U32*)seed);
+    U32 hash = *((const U32*)seed);
     while (str < end) {
         hash += *str++;
         hash += (hash << 10);
@@ -581,7 +581,7 @@ S_perl_hash_murmur_hash_64a (const unsigned char * const seed, const unsigned ch
 {
         const U64 m = UINT64_C(0xc6a4a7935bd1e995);
         const int r = 47;
-        U64 h = *((U64*)seed) ^ len;
+        U64 h = *((const U64*)seed) ^ len;
         const U64 * data = (const U64 *)str;
         const U64 * end = data + (len/8);
         const unsigned char * data2;
-- 
2.7.4

@p5pRT
Copy link
Author

p5pRT commented Nov 24, 2016

@p5pRT
Copy link
Author

p5pRT commented Nov 24, 2016

From @ppisar

0003-Fix-const-correctnes-for-utf8_hop-functions.patch
From e7771e5dd2682477edae807e0717f28f0b441679 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Petr=20P=C3=ADsa=C5=99?= <ppisar@redhat.com>
Date: Thu, 24 Nov 2016 17:58:15 +0100
Subject: [PATCH 3/3] Fix const correctnes for utf8_hop functions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

GCC -Wcast-qual option reports a const violation in utf8_hop
functions. They take a pointer to constant data but returns pointer to
non-constant data.

This patch makes the return value a constant. This changes API.

Another option would be to change remove const from the functions
input. But that would loose optimization posibilities and it would
utf8_hop functions would appear as they could modify the input and
that is not true.

Signed-off-by: Petr Písař <ppisar@redhat.com>
---
 embed.fnc |  8 ++++----
 inline.h  | 16 ++++++++--------
 proto.h   |  8 ++++----
 3 files changed, 16 insertions(+), 16 deletions(-)

diff --git a/embed.fnc b/embed.fnc
index 4743aed..6a7aa3c 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -1733,10 +1733,10 @@ Ap	|U8*	|utf16_to_utf8	|NN U8* p|NN U8 *d|I32 bytelen|NN I32 *newlen
 Ap	|U8*	|utf16_to_utf8_reversed|NN U8* p|NN U8 *d|I32 bytelen|NN I32 *newlen
 AdpPR	|STRLEN	|utf8_length	|NN const U8* s|NN const U8 *e
 AipdPR	|IV	|utf8_distance	|NN const U8 *a|NN const U8 *b
-AipdPRn	|U8*	|utf8_hop	|NN const U8 *s|SSize_t off
-AipdPRn	|U8*	|utf8_hop_back|NN const U8 *s|SSize_t off|NN const U8 *start
-AipdPRn	|U8*	|utf8_hop_forward|NN const U8 *s|SSize_t off|NN const U8 *end
-AipdPRn	|U8*	|utf8_hop_safe	|NN const U8 *s|SSize_t off|NN const U8 *start|NN const U8 *end
+AipdPRn	|const U8*	|utf8_hop	|NN const U8 *s|SSize_t off
+AipdPRn	|const U8*	|utf8_hop_back|NN const U8 *s|SSize_t off|NN const U8 *start
+AipdPRn	|const U8*	|utf8_hop_forward|NN const U8 *s|SSize_t off|NN const U8 *end
+AipdPRn	|const U8*	|utf8_hop_safe	|NN const U8 *s|SSize_t off|NN const U8 *start|NN const U8 *end
 ApMd	|U8*	|utf8_to_bytes	|NN U8 *s|NN STRLEN *len
 Apd	|int	|bytes_cmp_utf8	|NN const U8 *b|STRLEN blen|NN const U8 *u \
 				|STRLEN ulen
diff --git a/inline.h b/inline.h
index 346dcdc..b00ece4 100644
--- a/inline.h
+++ b/inline.h
@@ -896,7 +896,7 @@ on the first byte of character or just after the last byte of a character.
 =cut
 */
 
-PERL_STATIC_INLINE U8 *
+PERL_STATIC_INLINE const U8 *
 Perl_utf8_hop(const U8 *s, SSize_t off)
 {
     PERL_ARGS_ASSERT_UTF8_HOP;
@@ -916,7 +916,7 @@ Perl_utf8_hop(const U8 *s, SSize_t off)
 		s--;
 	}
     }
-    return (U8 *)s;
+    return s;
 }
 
 /*
@@ -936,7 +936,7 @@ Will not exceed this limit even if the string is not valid "UTF-8".
 =cut
 */
 
-PERL_STATIC_INLINE U8 *
+PERL_STATIC_INLINE const U8 *
 Perl_utf8_hop_forward(const U8 *s, SSize_t off, const U8 *end)
 {
     PERL_ARGS_ASSERT_UTF8_HOP_FORWARD;
@@ -951,11 +951,11 @@ Perl_utf8_hop_forward(const U8 *s, SSize_t off, const U8 *end)
     while (off--) {
         STRLEN skip = UTF8SKIP(s);
         if ((STRLEN)(end - s) <= skip)
-            return (U8 *)end;
+            return end;
         s += skip;
     }
 
-    return (U8 *)s;
+    return s;
 }
 
 /*
@@ -975,7 +975,7 @@ Will not exceed this limit even if the string is not valid "UTF-8".
 =cut
 */
 
-PERL_STATIC_INLINE U8 *
+PERL_STATIC_INLINE const U8 *
 Perl_utf8_hop_back(const U8 *s, SSize_t off, const U8 *start)
 {
     PERL_ARGS_ASSERT_UTF8_HOP_BACK;
@@ -993,7 +993,7 @@ Perl_utf8_hop_back(const U8 *s, SSize_t off, const U8 *start)
             s--;
     }
     
-    return (U8 *)s;
+    return s;
 }
 
 /*
@@ -1011,7 +1011,7 @@ Will not exceed those limits even if the string is not valid "UTF-8".
 =cut
 */
 
-PERL_STATIC_INLINE U8 *
+PERL_STATIC_INLINE const U8 *
 Perl_utf8_hop_safe(const U8 *s, SSize_t off, const U8 *start, const U8 *end)
 {
     PERL_ARGS_ASSERT_UTF8_HOP_SAFE;
diff --git a/proto.h b/proto.h
index 5ff6bfe..f1aab46 100644
--- a/proto.h
+++ b/proto.h
@@ -3494,25 +3494,25 @@ PERL_STATIC_INLINE IV	Perl_utf8_distance(pTHX_ const U8 *a, const U8 *b)
 #define PERL_ARGS_ASSERT_UTF8_DISTANCE	\
 	assert(a); assert(b)
 
-PERL_STATIC_INLINE U8*	Perl_utf8_hop(const U8 *s, SSize_t off)
+PERL_STATIC_INLINE const U8*	Perl_utf8_hop(const U8 *s, SSize_t off)
 			__attribute__warn_unused_result__
 			__attribute__pure__;
 #define PERL_ARGS_ASSERT_UTF8_HOP	\
 	assert(s)
 
-PERL_STATIC_INLINE U8*	Perl_utf8_hop_back(const U8 *s, SSize_t off, const U8 *start)
+PERL_STATIC_INLINE const U8*	Perl_utf8_hop_back(const U8 *s, SSize_t off, const U8 *start)
 			__attribute__warn_unused_result__
 			__attribute__pure__;
 #define PERL_ARGS_ASSERT_UTF8_HOP_BACK	\
 	assert(s); assert(start)
 
-PERL_STATIC_INLINE U8*	Perl_utf8_hop_forward(const U8 *s, SSize_t off, const U8 *end)
+PERL_STATIC_INLINE const U8*	Perl_utf8_hop_forward(const U8 *s, SSize_t off, const U8 *end)
 			__attribute__warn_unused_result__
 			__attribute__pure__;
 #define PERL_ARGS_ASSERT_UTF8_HOP_FORWARD	\
 	assert(s); assert(end)
 
-PERL_STATIC_INLINE U8*	Perl_utf8_hop_safe(const U8 *s, SSize_t off, const U8 *start, const U8 *end)
+PERL_STATIC_INLINE const U8*	Perl_utf8_hop_safe(const U8 *s, SSize_t off, const U8 *start, const U8 *end)
 			__attribute__warn_unused_result__
 			__attribute__pure__;
 #define PERL_ARGS_ASSERT_UTF8_HOP_SAFE	\
-- 
2.7.4

@p5pRT
Copy link
Author

p5pRT commented Nov 24, 2016

From zefram@fysh.org

Petr Pisar wrote​:

                                              I cannot see a way how to

get rid of the warnings there without changing prototype. I looked into glibc
sources how char *strchr(const char *, int) is solved. It isn't. It also
produces many warnings. Here I doubt about helpfullness of the -Wcast-qual
warnings.

Indeed. We're not going to want to change prototypes here, for the same
reasons that the strchr() prototype is appropriate as it is. The code
is correct, and there's no reasonable way to avoid these warnings.

Your first two patches seem OK. We're happy with that kind of localised
change to avoid warnings.

-zefram

@p5pRT
Copy link
Author

p5pRT commented Nov 24, 2016

The RT System itself - Status changed from 'new' to 'open'

@p5pRT
Copy link
Author

p5pRT commented Nov 25, 2016

From @ppisar

On Thu, Nov 24, 2016 at 01​:05​:58PM -0800, Zefram via RT wrote​:

Indeed. We're not going to want to change prototypes here, for the same
reasons that the strchr() prototype is appropriate as it is. The code
is correct, and there's no reasonable way to avoid these warnings.

I searched Perl sources to see how utf8_hop is used and it's realy mix of all
the ways.

Should I bother with compiler pragmata to silent the warning? GCC-specific
code would be​:

  char *foo(const char *a) {
  #if defined __GNUC__
  #pragma GCC diagnostic push
  #pragma GCC diagnostic ignored "-Wcast-qual"
  #endif
  return (char *)a;
  #if defined __GNUC__
  #pragma GCC diagnostic pop
  #endif
  }

I do not want to clutter the sources.

Your first two patches seem OK. We're happy with that kind of localised
change to avoid warnings.

Thanks for reviewing the patches.

-- Petr

@p5pRT
Copy link
Author

p5pRT commented Nov 28, 2016

From @ppisar

On Fri, Nov 25, 2016 at 10​:14​:41AM -0500, Andy Lester wrote​:

Should I bother with compiler pragmata to silent the warning?
[...]
I would be in favor of doing that if it was wrapped in a macro. It would be
good to have it explicitly obvious that we are casting away const, so that
we can have the compiler flag the ones we don't mean to cast away.

Me either. I searched the sources again and I found there had already been
GCC_DIAG_IGNORE and GCC_DIAG_RESTORE macros for that.

The attached patch that replaces the third patch from the original patch set
uses them to silence the warnings in utf8_hop functions.

-- Petr

@p5pRT
Copy link
Author

p5pRT commented Nov 28, 2016

From @ppisar

0003-Silent-const-correctnes-warnings-in-utf8_hop-functio.patch
From 4c74b8f309939863c82124e3ebe686f96cceb9cc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Petr=20P=C3=ADsa=C5=99?= <ppisar@redhat.com>
Date: Mon, 28 Nov 2016 13:06:24 +0100
Subject: [PATCH 3/3] Silent const correctnes warnings in utf8_hop functions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

GCC -Wcast-qual option reports a const violation in utf8_hop
functions. They take a pointer to constant data but returns pointer to
non-constant data.

It's impossible to fix this without changing their prototype.
Therefore this patch asks a compiler to ignore the violations.

Signed-off-by: Petr Písař <ppisar@redhat.com>
---
 inline.h | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/inline.h b/inline.h
index 346dcdc..acd19e5 100644
--- a/inline.h
+++ b/inline.h
@@ -916,7 +916,9 @@ Perl_utf8_hop(const U8 *s, SSize_t off)
 		s--;
 	}
     }
+    GCC_DIAG_IGNORE(-Wcast-qual);
     return (U8 *)s;
+    GCC_DIAG_RESTORE;
 }
 
 /*
@@ -950,12 +952,17 @@ Perl_utf8_hop_forward(const U8 *s, SSize_t off, const U8 *end)
 
     while (off--) {
         STRLEN skip = UTF8SKIP(s);
-        if ((STRLEN)(end - s) <= skip)
+        if ((STRLEN)(end - s) <= skip) {
+            GCC_DIAG_IGNORE(-Wcast-qual);
             return (U8 *)end;
+            GCC_DIAG_RESTORE;
+        }
         s += skip;
     }
 
+    GCC_DIAG_IGNORE(-Wcast-qual);
     return (U8 *)s;
+    GCC_DIAG_RESTORE;
 }
 
 /*
@@ -993,7 +1000,9 @@ Perl_utf8_hop_back(const U8 *s, SSize_t off, const U8 *start)
             s--;
     }
     
+    GCC_DIAG_IGNORE(-Wcast-qual);
     return (U8 *)s;
+    GCC_DIAG_RESTORE;
 }
 
 /*
-- 
2.7.4

@p5pRT
Copy link
Author

p5pRT commented Nov 30, 2016

From @tonycoz

On Mon, 28 Nov 2016 04​:22​:12 -0800, ppisar wrote​:

On Fri, Nov 25, 2016 at 10​:14​:41AM -0500, Andy Lester wrote​:

Should I bother with compiler pragmata to silent the warning?
[...]
I would be in favor of doing that if it was wrapped in a macro. It
would be
good to have it explicitly obvious that we are casting away const, so
that
we can have the compiler flag the ones we don't mean to cast away.

Me either. I searched the sources again and I found there had already
been
GCC_DIAG_IGNORE and GCC_DIAG_RESTORE macros for that.

The attached patch that replaces the third patch from the original
patch set
uses them to silence the warnings in utf8_hop functions.

Thanks, applied the first two of the original patches and this new patch as 463ddf3, 9f2eed9 and de97954.

Tony

@p5pRT
Copy link
Author

p5pRT commented Nov 30, 2016

@tonycoz - Status changed from 'open' to 'pending release'

@p5pRT
Copy link
Author

p5pRT commented May 30, 2017

From @khwilliamson

Thank you for filing this report. You have helped make Perl better.

With the release today of Perl 5.26.0, this and 210 other issues have been
resolved.

Perl 5.26.0 may be downloaded via​:
https://metacpan.org/release/XSAWYERX/perl-5.26.0

If you find that the problem persists, feel free to reopen this ticket.

@p5pRT
Copy link
Author

p5pRT commented May 30, 2017

@khwilliamson - Status changed from 'pending release' to 'resolved'

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant