Tiny snippet for converting 4 hex characters to an integer in C

Source: https://www.devze.com, 2023-01-16 03:19 (reposted from the web)
I need to parse strings of four hex characters to an integer. The characters appear inside a longer string, and there are no separators - I just know the offset they can be found in. The hex characters are case insensitive. Example with offset 3:

"foo10a4bar" -> 4260

I'm looking for a snippet that is

  • Short (too much code always creates complexity)
  • Simple (simple to understand and verify that it is correct)
  • Safe (invalid input is detected and signalled, no potential memory problems)

I'm a bit leery of using the 'sscanf' family of functions for this, but if there's a safe ANSI C solution using them, they can be used.


strtol is simple with good error handling:

enum { OFFSET = 3, LEN = 4 }; /* true constants, so hex[] below is a plain array, not a VLA */
char hex[LEN + 1];
int i;
for(i = 0; i < LEN && str[OFFSET + i]; i++)
{
  hex[i] = str[OFFSET + i];
  if(!isxdigit((unsigned char) hex[i]))
  {
    // signal error, return
  }
}
if(i != LEN)
{
  // signal error, return
}
hex[LEN] = '\0';
char *end;
int result = (int) strtol(hex, &end, 16);
if(end != hex + LEN)
{
  // signal error, return
}
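
The fragment above leaves the error signalling open. One way to package it, as a rough sketch only: the name parse_hex4, the status-code convention and the long output parameter are my own assumptions, not part of the answer.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parses exactly 4 hex digits starting at str[offset].
 * Returns 0 on success and stores the value in *out; returns -1 on error. */
int parse_hex4(const char *str, size_t offset, long *out)
{
    char hex[5];
    char *end;
    size_t i;

    if (str == NULL || out == NULL || strlen(str) < offset)
        return -1;                      /* offset lies past the terminator */

    for (i = 0; i < 4; i++) {
        unsigned char c = (unsigned char) str[offset + i];
        if (!isxdigit(c))               /* also catches an early '\0' */
            return -1;
        hex[i] = (char) c;
    }
    hex[4] = '\0';

    *out = strtol(hex, &end, 16);
    return (end == hex + 4) ? 0 : -1;   /* all 4 characters must be consumed */
}
```

Called as parse_hex4("foo10a4bar", 3, &v), this should return 0 and leave 4260 in v. The result goes through a long because an int is only guaranteed to hold values up to 32767, while four hex digits can reach 65535.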


It's usually best to use standard functions where you can, to get concise and simple code:

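#include <errno.h>   /* errno */
#include <stdlib.h>  /* strtol */
#include <string.h>  /* strlen, memcpy, strspn */

#define HEXLEN 4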

long extract_hex(const char *src, size_t offset)
{
    char hex[HEXLEN + 1] = { 0 };
    long val;

    if (strlen(src) < offset + HEXLEN)
        return -1;

    memcpy(hex, src + offset, HEXLEN);

    if (strspn(hex, "0123456789AaBbCcDdEeFf") < HEXLEN)
        return -1;

    errno = 0;
    val = strtol(hex, NULL, 16);

    /* Out of range - can't occur unless HEXLEN > 7 */
    if (errno)
        return -1;

    return val;
}


Here's my attempt

#include <assert.h>

static int h2d(char c) {
    int x;
    switch (c) {
        default: x = -1; break; /* invalid hex digit */
        case '0': x = 0; break;
        case '1': x = 1; break;
        case '2': x = 2; break;
        /* ... cases '3' to '9' and 'A'/'a' to 'D'/'d' follow the same pattern ... */
        case 'E': case 'e': x = 14; break;
        case 'F': case 'f': x = 15; break;
    }
    return x;
}

int hex4(const char *src, int offset) {
    int tmp, val = 0;
    tmp = h2d(*(src+offset+0)); assert(tmp >= 0); val += tmp << 12;
    tmp = h2d(*(src+offset+1)); assert(tmp >= 0); val += tmp << 8;
    tmp = h2d(*(src+offset+2)); assert(tmp >= 0); val += tmp << 4;
    tmp = h2d(*(src+offset+3)); assert(tmp >= 0); val += tmp;
    return val;
}

Of course, instead of assert use your preferred method of validation!

And you can use it like this

int val = hex4("foo10a4bar", 3);


Here's an alternative based on character arithmetic:

int hexdigits(char *str, int ndigits)
{
    int i;
    int n = 0;
    for (i=0; i<ndigits; ++i) {
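        int d = *str++ - '0';
        if (d > 9 || d < 0) {
            d += '0' - 'A';             /* d is now c - 'A' */
            if (d > 5 || d < 0)
                d += 'A' - 'a';         /* d is now c - 'a' */
            if (d > 5 || d < 0)
                return -1;              /* also rejects ':' and '[', which sit between the ranges */
            d += 10;
        }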
        n <<= 4;
        n |= d;
    }
    return n;
}

It should handle digits in both cases, and work for both ASCII and EBCDIC. Using it for more than 7 digits invites integer overflow, and may make the use of -1 as an error value indistinguishable from a valid conversion.

Just call it with the offset added to the base string: e.g. w = hexdigits(buf+3, 4); for the suggested offset of 3 chars into a string stored in buf.
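
The overflow and the ambiguous -1 mentioned above can both be avoided by returning a status and passing the value back through a pointer. The following is only a sketch of that idea: hexdigits_checked and MAX_HEX_DIGITS are hypothetical names, and the digit step uses the ctype functions rather than the arithmetic of the original.

```c
#include <ctype.h>
#include <limits.h>
#include <stddef.h>

/* Most hex digits that fit in an unsigned long (assumes no padding bits). */
#define MAX_HEX_DIGITS (sizeof(unsigned long) * CHAR_BIT / 4)

/* Hypothetical variant: returns 0 on success, -1 on error. The value goes
 * through *out, so every result, including all-ones, is unambiguous. */
int hexdigits_checked(const char *str, size_t ndigits, unsigned long *out)
{
    unsigned long n = 0;
    size_t i;

    if (ndigits == 0 || ndigits > MAX_HEX_DIGITS)
        return -1;                      /* would overflow the accumulator */

    for (i = 0; i < ndigits; i++) {
        int c = (unsigned char) str[i];
        if (!isxdigit(c))
            return -1;                  /* includes hitting '\0' early */
        /* Assumes 'a'..'f' are contiguous, as they are in ASCII and EBCDIC. */
        n = (n << 4) | (unsigned long) (isdigit(c) ? c - '0'
                                                   : tolower(c) - 'a' + 10);
    }
    *out = n;
    return 0;
}
```

Usage mirrors the original: hexdigits_checked(buf + 3, 4, &w) for four digits at offset 3.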

Edit: Here's a version with fewer conditions that is guaranteed to work for ASCII. I'm reasonably certain it will work for EBCDIC as well, but don't have any text of that flavor lying around to prove it.

Also, I fixed a stupid oversight and made the accumulator an int instead of unsigned short. It wouldn't affect the 4-digit case, but it made it overflow at only 16-bit numbers instead of the full capacity of an int.

int hexdigits2(char *str, int ndigits)
{
    int i;
    int n = 0;
    for (i=0; i<ndigits; ++i) {
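        unsigned char d = *str++ - '0';
        if (d > 9) {
            d += '0' - 'A';             /* d is now c - 'A', wrapped to 0..255 */
            if (d > 5)
                d += 'A' - 'a';         /* d is now c - 'a' */
            if (d > 5)
                return -1;              /* also rejects ':' and '[', which sit between the ranges */
            d += 10;
        }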
        n <<= 4;
        n |= d;
    }
    return n;
}

Usage is the same as the earlier version, but the generated code could be a bit smaller.
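
Since there are only UCHAR_MAX + 1 possible characters, arithmetic like this can be checked exhaustively instead of argued about. Below is a sketch of such a check. Everything in it is hypothetical scaffolding: ref_digit is a lookup-based reference, range_hexdigits is a stand-in converter with the same signature as hexdigits, and count_disagreements compares the two on every one-character string.

```c
#include <ctype.h>
#include <limits.h>
#include <string.h>

/* Signature shared by hexdigits() and hexdigits2() above. */
typedef int (*hex_fn)(char *str, int ndigits);

/* Hypothetical reference: string lookup, independent of the character set. */
int ref_digit(int c)
{
    static const char digits[] = "0123456789abcdef";
    const char *p;

    if (c == '\0' || !isxdigit(c))
        return -1;
    p = strchr(digits, tolower(c));
    return p ? (int) (p - digits) : -1;
}

/* Hypothetical converter under test, using plain range comparisons. */
int range_hexdigits(char *str, int ndigits)
{
    int i, n = 0;
    for (i = 0; i < ndigits; i++) {
        int c = (unsigned char) str[i];
        if (c >= '0' && c <= '9')      n = (n << 4) | (c - '0');
        else if (c >= 'a' && c <= 'f') n = (n << 4) | (c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') n = (n << 4) | (c - 'A' + 10);
        else return -1;
    }
    return n;
}

/* Feeds every one-character string to convert() and counts results that
 * differ from the reference. A return value of 0 means full agreement. */
int count_disagreements(hex_fn convert)
{
    int c, bad = 0;
    for (c = 0; c <= UCHAR_MAX; c++) {
        char buf[2];
        buf[0] = (char) c;
        buf[1] = '\0';
        if (convert(buf, 1) != ref_digit(c))
            bad++;
    }
    return bad;
}
```

Passing hexdigits or hexdigits2 in place of range_hexdigits shows whether the arithmetic rejects characters such as ':' or '[' that sit between the digit and letter ranges. Running it on an EBCDIC machine would settle that case too.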


Here's my own try at it now that I thought about it for a moment - I'm not at all sure this is the best, so I will wait a while and then accept the answer that seems best to me.

    val = 0;
    for (i = 0; i < 4; i++) {
        val <<= 4;
        if (ptr[offset+i] >= '0' && ptr[offset+i] <= '9')
            val += ptr[offset+i] - '0';
        else if (ptr[offset+i] >= 'a' && ptr[offset+i] <= 'f')
            val += (ptr[offset+i] - 'a') + 10;
        else if (ptr[offset+i] >= 'A' && ptr[offset+i] <= 'F')
            val += (ptr[offset+i] - 'A') + 10;
        else {
            /* signal error */
        }
    }


/* Scans s for the first position at which sscanf can read hex digits
   (at most 4 of them) and stores the value in *val. Returns 1 on success. */
int evalonehexFromStr( const char *s, unsigned long *val )
{
  while( *s )
    if( 1==sscanf(s++, "%4lx", val ) )
      return 1;
  return 0;
}

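The field width of 4 means sscanf reads at most 4 hex digits, so a shorter run is accepted as well, and the scan stops at the first hex digit it meets. In "foo10a4bar" that is the leading 'f', so pass the string starting at the known offset, eg:

unsigned long result;
if( evalonehexFromStr("foo10a4bar" + 3, &result) )
  printf("\nOK - %lu", result);

If you need other hex-digit sizes, replace "4" with your size, or use "%lx" to read any hex value up to ULONG_MAX.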
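
Because the question asks for invalid input to be detected, a stricter sscanf-based variant might look like the sketch below. The name scan_hex4 and its offset parameter are hypothetical. The isxdigit pre-check is there because sscanf on its own skips leading whitespace and accepts a sign or a "0x" prefix, and %n is used to confirm that all four characters were consumed.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: reads exactly 4 hex digits at s[offset] with sscanf.
 * Returns 1 on success, 0 on failure. */
int scan_hex4(const char *s, size_t offset, unsigned long *val)
{
    size_t i;
    int consumed = 0;

    /* Walk up to the end of the field: fail if the string ends first or if
     * any of the 4 characters at the offset is not a hex digit. */
    for (i = 0; i < offset + 4; i++) {
        if (s[i] == '\0')
            return 0;                   /* string too short */
        if (i >= offset && !isxdigit((unsigned char) s[i]))
            return 0;
    }

    if (sscanf(s + offset, "%4lx%n", val, &consumed) != 1)
        return 0;
    return consumed == 4;               /* all 4 characters were converted */
}
```

With this, scan_hex4("foo10a4bar", 3, &result) should succeed with 4260, while offset 0 fails because "foo1" is not four hex digits.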


Code

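#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    int offset;

    if (argc < 3)
        return EXIT_FAILURE;            /* expects: <string> <offset> */
    offset = atoi(argv[2]);
    if (offset < 0 || strlen(argv[1]) < (size_t) offset + 4)
        return EXIT_FAILURE;            /* fewer than 4 characters at offset */
    argv[1][offset + 4] = '\0';
    printf("%ld\n", strtol(argv[1] + offset, NULL, 0x10));  /* strtol returns long */
    return 0;
}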

Usage

matt@stanley:$ make small_hex_converter
cc     small_hex_converter.c   -o small_hex_converter
matt@stanley:$ ./small_hex_converter f0010a4bar 3
4260