I've just recently started learning C, after a year learning and writing in Java. I have a lot of catching up to do when it comes to memory, variables and stuff like that. I have a few questions, any answer would really help:
- "A value of type char always occupies one byte" - does that mean that char is always exactly 8 bits long?
- Which character sets are used in modern computers? Is there a difference between operating systems?
- What is the difference between a signed char and an unsigned char? And is the number that represents a character machine dependent?
- What is the difference between char *ch = "text" and char ch[] = "text"?
- What is the benefit of using unsigned variables?
- C defines the minimum storage size for standard types. Who defines the maximum, and how is it determined?
- Today, when (for example) do we use hexadecimal, or any base other than binary?
- Is memory alignment machine dependent? And if it is, is there a common method in modern computers?
No. In C, a byte is defined as an addressable unit of data storage large enough to hold any member of the basic character set of the execution environment; it only has to be at least 8 bits. I've used a C implementation with 9-bit bytes, and I know of some with 16-bit bytes (DSPs) or 64-bit bytes (Cray).
OSes other than Windows tend to be relatively neutral about the character set. There is a trend toward Unicode, but ISO-8859-X in Europe and various wide character sets in Asia still have their followers.
It is locale dependent.
The first defines a pointer to char initialized to point to the string "text"; the second defines an array of 5 chars initialized with "text" plus the terminating '\0'.
There is no maximum. An implementation uses what it deems useful, more often than not by respecting an ABI determined by the OS (some OSes have several ABIs, in which types can have different sizes).
Convenience for humans.
A common rule: align to the size of the widest member that is a basic type.
"A value of type char always occupies one byte" - does that mean that char is always exactly 8 bits long?
No, it means that in the C language a byte is not always 8 bits long. There is a macro, CHAR_BIT, in limits.h that tells you how many bits there are in a byte. CHAR_BIT is at least 8.
"What characters-sets is used in modern computers? Is there a difference between different OS?"
Mostly not a C question. The C standard only requires a small basic character set and does not say which encoding is used; that is left to the implementation and the locale.
"What is the difference between a signed char and an unsigned char? And is the character represented by a number machine depended?"
Well, one is signed and the other is unsigned. Note that plain char is either signed or unsigned, depending on your platform, so there are really three distinct types: char, signed char and unsigned char. The C language doesn't define which number each character corresponds to.
"what is the difference between char *ch = "text" and char ch[] = "text""
The first one will trigger undefined behavior when you try to modify the string through it (and should be written const char *ch = "text"), the second will not. Indeed, the first construct creates a string literal (typically placed in a read-only section such as .rodata) and gives you a pointer to it. The second construct defines a 5-char array and fills it with 't' 'e' 'x' 't' '\0'.
"What is the benefit using unsigned variables?"
The arithmetic of unsigned variables is guaranteed to be arithmetic modulo 2^n, where n is the number of value bits in the type. There are fewer guarantees for signed types (e.g. signed overflow has undefined behavior).
"C defines the minimum sortage size for standard types. Who defines the maximum, and how is it determined?"
Nobody. There are convenience macros in limits.h and stdint.h if you need the maximum values for your implementation.
"Today, when (for example) do we use hexadecimal base, or any other base other then binary?"
not relevant to C language
"Is memory align machine dependent? And if it is, is there any common method in modern-days computers?"
Yes, it is machine dependent, and it usually shouldn't be relevant to your programming. The C language guarantees nothing about specific alignment values, except that the implementation properly aligns variables and structure members for you.
1) yes, if a byte is 8 bits
2) there are ASCII and Unicode; as a developer you can choose (because some functions accept Unicode params and some others want ASCII), but C strings are traditionally zero-terminated ASCII (ASCIIZ)
3) signed char = typically from -128 to +127, unsigned char = from 0 to 255 (with 8-bit bytes). No, it depends on the character set (ASCII or Unicode)
4) the first one is a pointer (the string literal itself has static storage and is often read-only), the second one is an array (stored on the stack when it is a local variable)
5) dunno
6) compiler dependent I guess
7) you choose. The number represented is the same. if you feel more comfortable using octal base than decimal there's no problem
8) yes. it depends on the word size of that machine, usually 32 or 64 bit.