Solidity in Depth¶
This section should provide you with all you need to know about Solidity. If something is missing here, please contact us on Gitter or make a pull request on GitHub.
Layout of a Solidity Source File¶
Source files can contain an arbitrary number of contract definitions and include directives.
Importing other Source Files¶
Syntax and Semantics¶
Solidity supports import statements that are very similar to those available in JavaScript (from ES6 on), although Solidity does not know the concept of a “default export”.
At a global level, you can use import statements of the following form:
import "filename";
...will import all global symbols from “filename” (and symbols imported there) into the current global scope (different than in ES6 but backwards-compatible for Solidity).
import * as symbolName from "filename";
...creates a new global symbol symbolName whose members are all the global symbols from “filename”.
import {symbol1 as alias, symbol2} from "filename";
...creates new global symbols alias and symbol2 which reference symbol1 and symbol2 from “filename”, respectively.
Another syntax is not part of ES6, but probably convenient:
import "filename" as symbolName;
...is equivalent to import * as symbolName from "filename";.
Paths¶
In the above, filename is always treated as a path with / as directory separator, . as the current and .. as the parent directory. Path names that do not start with . are treated as absolute paths.
To import a file x from the same directory as the current file, use import "./x" as x;. If you use import "x" as x; instead, a different file could be referenced (in a global “include directory”).
How the paths are actually resolved depends on the compiler (see below). In general, the directory hierarchy does not need to map strictly onto your local filesystem; it can also map to resources discovered via e.g. ipfs, http or git.
Use in actual Compilers¶
When the compiler is invoked, it is not only possible to specify how to discover the first element of a path, but it is possible to specify path prefix remappings so that e.g. github.com/ethereum/dapp-bin/library is remapped to /usr/local/dapp-bin/library and the compiler will read the files from there. If remapping keys are prefixes of each other, the longest is tried first. This allows for a “fallback-remapping” with e.g. “” maps to “/usr/local/include/solidity”.
solc:
For solc (the commandline compiler), these remappings are provided as key=value arguments, where the =value part is optional (and defaults to key in that case). All remapping values that are regular files are compiled (including their dependencies). This mechanism is completely backwards-compatible (as long as no filename contains a =) and thus not a breaking change.
So as an example, if you clone github.com/ethereum/dapp-bin/ locally to /usr/local/dapp-bin, you can use the following in your source file:
import "github.com/ethereum/dapp-bin/library/iterable_mapping.sol" as it_mapping;
and then run the compiler as
solc github.com/ethereum/dapp-bin/=/usr/local/dapp-bin/ source.sol
Note that solc only allows you to include files from certain directories: They have to be in the directory (or subdirectory) of one of the explicitly specified source files or in the directory (or subdirectory) of a remapping target. If you want to allow direct absolute includes, just add the remapping =/.
If there are multiple remappings that lead to a valid file, the remapping with the longest common prefix is chosen.
browser-solidity:
The browser-based compiler provides an automatic remapping for github and will also automatically retrieve the file over the network: You can import the iterable mapping by e.g. import "github.com/ethereum/dapp-bin/library/iterable_mapping.sol" as it_mapping;.
Other source code providers may be added in the future.
Comments¶
Single-line comments (//) and multi-line comments (/*...*/) are possible.
// This is a single-line comment.

/*
This is a multi-line comment.
*/
There are special types of comments called natspec comments (documentation yet to be written). These are introduced by triple-slash comments (///) or by double-asterisk blocks (/** ... */). Right in front of function declarations or statements, you can use doxygen-style tags inside them to document functions, annotate conditions for formal verification and provide a confirmation text that is shown to users if they want to invoke a function.
Structure of a Contract¶
Contracts in Solidity are similar to classes in object-oriented languages. Each contract can contain declarations of State Variables, Functions, Function Modifiers, Events, Structs Types and Enum Types. Furthermore, contracts can inherit from other contracts.
State Variables¶
State variables are values which are permanently stored in contract storage.
contract SimpleStorage {
    uint storedData; // State variable
    // ...
}
See the Types section for valid state variable types and Visibility and Accessors for possible choices for visibility.
Functions¶
Functions are the executable units of code within a contract.
contract SimpleAuction {
    function bid() { // Function
        // ...
    }
}
Function Calls can happen internally or externally and have different levels of visibility (Visibility and Accessors) towards other contracts.
Function Modifiers¶
Function modifiers can be used to amend the semantics of functions in a declarative way (see Function Modifiers in contracts section).
contract Purchase {
    address public seller;

    modifier onlySeller() { // Modifier
        if (msg.sender != seller) throw;
        _
    }

    function abort() onlySeller { // Modifier usage
        // ...
    }
}
Events¶
Events are convenience interfaces with the EVM logging facilities.
contract SimpleAuction {
    event HighestBidIncreased(address bidder, uint amount); // Event

    function bid() {
        // ...
        HighestBidIncreased(msg.sender, msg.value); // Triggering event
    }
}
See Events in contracts section for information on how events are declared and can be used from within a dapp.
Structs Types¶
Structs are custom defined types that can group several variables (see Structs in types section).
contract Ballot {
    struct Voter { // Struct
        uint weight;
        bool voted;
        address delegate;
        uint vote;
    }
}
Enum Types¶
Enums can be used to create custom types with a finite set of values (see Enums in types section).
contract Purchase {
    enum State { Created, Locked, Inactive } // Enum
}
Types¶
Solidity is a statically typed language, which means that the type of each variable (state and local) needs to be specified (or at least known - see Type Deduction below) at compile-time. Solidity provides several elementary types which can be combined to complex types.
Value Types¶
The following types are also called value types because variables of these types will always be passed by value, i.e. they are always copied when they are used as function arguments or in assignments.
Booleans¶
bool: The possible values are constants true and false.
Operators:
- ! (logical negation)
- && (logical conjunction, “and”)
- || (logical disjunction, “or”)
- == (equality)
- != (inequality)
The operators || and && apply the common short-circuiting rules. This means that in the expression f(x) || g(y), if f(x) evaluates to true, g(y) will not be evaluated even if it may have side-effects.
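The short-circuiting rule can be observed with a function that has a visible side effect. The following is a minimal sketch in the same dialect as the other examples here; the contract ShortCircuitExample and its members calls, touch and test are hypothetical names chosen for illustration.

```solidity
contract ShortCircuitExample {
    uint public calls; // counts how often touch() actually ran

    // Has a side effect (increments calls) and always returns true.
    function touch() returns (bool) {
        calls++;
        return true;
    }

    function test() {
        // The left-hand touch() returns true, so the right-hand touch()
        // is never evaluated: calls increases by 1, not by 2.
        if (touch() || touch()) {
            // ...
        }
    }
}
```

With && instead of ||, both operands would be evaluated here, because the left-hand side being true does not yet decide the result.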
Integers¶
int• / uint•: Signed and unsigned integers of various sizes. Keywords uint8 to uint256 in steps of 8 (unsigned of 8 up to 256 bits) and int8 to int256. uint and int are aliases for uint256 and int256, respectively.
Operators:
- Comparisons: <=, <, ==, !=, >=, > (evaluate to bool)
- Bit operators: &, |, ^ (bitwise exclusive or), ~ (bitwise negation)
- Arithmetic operators: +, -, unary -, unary +, *, /, % (remainder), ** (exponentiation)
Division always truncates (it just maps to the DIV opcode of the EVM), but it does not truncate if both operands are literals (or literal expressions).
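As a small illustration of truncating division on non-literal operands, consider the sketch below. The contract and function names (DivisionExample, half, remainder) are made up for this example.

```solidity
contract DivisionExample {
    function half(uint x) returns (uint) {
        // x is not a literal, so the result is truncated:
        // half(7) yields 3.
        return x / 2;
    }

    function remainder(uint x) returns (uint) {
        // The part discarded by the division can be obtained with %:
        // remainder(7) yields 1.
        return x % 2;
    }
}
```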
Address¶
address: Holds a 20 byte value (size of an Ethereum address). Address types also have members (see Members of Addresses) and serve as base for all contracts.
Operators:
- <=, <, ==, !=, >= and >
Members of Addresses¶
- balance and send
It is possible to query the balance of an address using the property balance and to send Ether (in units of wei) to an address using the send function:
address x = 0x123;
address myAddress = this;
if (x.balance < 10 && myAddress.balance >= 10) x.send(10);
Note
If x is a contract address, its code (more specifically: its fallback function, if present) will be executed together with the send call (this is a limitation of the EVM and cannot be prevented). If that execution runs out of gas or fails in any way, the Ether transfer will be reverted. In this case, send returns false.
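Since send reports a failed transfer through its return value rather than by raising an exception, callers usually want to check it. A minimal sketch, assuming a hypothetical contract SendExample with a function pay:

```solidity
contract SendExample {
    function pay(address recipient, uint amount) {
        // send returns false if the transfer failed (for example because
        // the recipient's fallback function ran out of gas), so the
        // result is checked and the whole call is aborted on failure.
        if (!recipient.send(amount))
            throw;
    }
}
```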
- call, callcode and delegatecall
Furthermore, to interface with contracts that do not adhere to the ABI, the function call is provided which takes an arbitrary number of arguments of any type. These arguments are padded to 32 bytes and concatenated. One exception is the case where the first argument is encoded to exactly four bytes. In this case, it is not padded to allow the use of function signatures here.
address nameReg = 0x72ba7d8e73fe8eb666ea66babc8116a41bfb10e2;
nameReg.call("register", "MyName");
nameReg.call(bytes4(sha3("fun(uint256)")), a);
call returns a boolean indicating whether the invoked function terminated (true) or caused an EVM exception (false). It is not possible to access the actual data returned (for this we would need to know the encoding and size in advance).
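The boolean result of call can be captured and forwarded like any other value. The sketch below reuses the "register" call from the snippet above; the contract CallExample and the function tryRegister are hypothetical.

```solidity
contract CallExample {
    function tryRegister(address nameReg) returns (bool success) {
        // call only reports whether the callee terminated normally (true)
        // or caused an EVM exception (false); any data the callee
        // returned is not accessible here.
        success = nameReg.call("register", "MyName");
    }
}
```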
In a similar way, the function delegatecall can be used: The difference is that only the code of the given address is used, all other aspects (storage, balance, ...) are taken from the current contract. The purpose of delegatecall is to use library code which is stored in another contract. The user has to ensure that the layout of storage in both contracts is suitable for delegatecall to be used. Prior to Homestead, only a limited variant called callcode was available that did not provide access to the original msg.sender and msg.value values.
All three functions call, delegatecall and callcode are very low-level functions and should only be used as a last resort as they break the type-safety of Solidity.
Note
All contracts inherit the members of address, so it is possible to query the balance of the current contract using this.balance.
Fixed-size byte arrays¶
bytes1, bytes2, bytes3, ..., bytes32. byte is an alias for bytes1.
Operators:
- Comparisons: <=, <, ==, !=, >=, > (evaluate to bool)
- Bit operators: &, |, ^ (bitwise exclusive or), ~ (bitwise negation)
- Index access: If x is of type bytesI, then x[k] for 0 <= k < I returns the k-th byte (read-only).
Members:
- .length yields the fixed length of the byte array (read-only).
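Index access and the length member of a fixed-size byte array might be used as in the following sketch. The names BytesExample, firstByte and size are assumptions made for this illustration.

```solidity
contract BytesExample {
    function firstByte(bytes4 x) returns (byte) {
        // Read-only index access; the result has type byte (bytes1).
        return x[0];
    }

    function size(bytes4 x) returns (uint) {
        // The length is fixed by the type, so this is always 4.
        return x.length;
    }
}
```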
Dynamically-sized byte array¶
- bytes:
- Dynamically-sized byte array, see Arrays. Not a value-type!
- string:
- Dynamically-sized UTF8-encoded string, see Arrays. Not a value-type!
As a rule of thumb, use bytes for arbitrary-length raw byte data and string for arbitrary-length string (utf-8) data. If you can limit the length to a certain number of bytes, always use one of bytes1 to bytes32 because they are much cheaper.
Integer Literals¶
Integer Literals are arbitrary precision integers until they are used together with a non-literal. In var x = 1 - 2;, for example, the value of 1 - 2 is -1, which is assigned to x and thus x receives the type int8 – the smallest type that contains -1, although the natural types of 1 and 2 are actually uint8.
It is even possible to temporarily exceed the maximum of 256 bits as long as only integer literals are used for the computation: var x = (0xffffffffffffffffffff * 0xffffffffffffffffffff) * 0; Here, x will have the value 0 and thus the type uint8.
Warning
Division on integer literals used to truncate in earlier versions, but it will actually convert into a rational number in the future, i.e. 1/2 is not equal to 0, but to 0.5.
String Literals¶
String Literals are written with double quotes (“abc”). As with integer literals, their type can vary, but they are implicitly convertible to bytes• if they fit, to bytes and to string.
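The three implicit conversions mentioned above could look as follows when a literal is used to initialise state variables. This is only a sketch; the contract LiteralExample and its variable names are hypothetical.

```solidity
contract LiteralExample {
    // The literal fits into 32 bytes, so it converts to bytes32.
    bytes32 fixedName = "abc";
    // The same literal converts to a dynamically-sized byte array...
    bytes rawName = "abc";
    // ...and to a string.
    string textName = "abc";
}
```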
Enums¶
Enums are one way to create a user-defined type in Solidity. They are explicitly convertible to and from all integer types but implicit conversion is not allowed.
contract test {
    enum ActionChoices { GoLeft, GoRight, GoStraight, SitStill }
    ActionChoices choice;
    ActionChoices constant defaultChoice = ActionChoices.GoStraight;

    function setGoStraight() {
        choice = ActionChoices.GoStraight;
    }

    // Since enum types are not part of the ABI, the signature of "getChoice"
    // will automatically be changed to "getChoice() returns (uint8)"
    // for all matters external to Solidity. The integer type used is just
    // large enough to hold all enum values, i.e. if you have more values,
    // `uint16` will be used and so on.
    function getChoice() returns (ActionChoices) {
        return choice;
    }

    function getDefaultChoice() returns (uint) {
        return uint(defaultChoice);
    }
}
Reference Types¶
Complex types, i.e. types which do not always fit into 256 bits, have to be handled more carefully than the value-types we have already seen. Since copying them can be quite expensive, we have to think about whether we want them to be stored in memory (which is not persistent) or storage (where the state variables are held).
Data location¶
Every complex type, i.e. arrays and structs, has an additional annotation, the “data location”, about whether it is stored in memory or in storage. Depending on the context, there is always a default, but it can be overridden by appending either storage or memory to the type. The default for function parameters (including return parameters) is memory, the default for local variables is storage and the location is forced to storage for state variables (obviously).
There is also a third data location, “calldata”, which is a non-modifiable, non-persistent area where function arguments are stored. Function parameters (not return parameters) of external functions are forced to “calldata” and it behaves mostly like memory.
Data locations are important because they change how assignments behave: Assignments between storage and memory and also to a state variable (even from other state variables) always create an independent copy. Assignments to local storage variables only assign a reference though, and this reference always points to the state variable even if the latter is changed in the meantime. On the other hand, assignments from a memory-stored reference type to another memory-stored reference type do not create a copy.
contract c {
    uint[] x; // the data location of x is storage

    // the data location of memoryArray is memory
    function f(uint[] memoryArray) {
        x = memoryArray; // works, copies the whole array to storage
        var y = x; // works, assigns a pointer, data location of y is storage
        y[7]; // fine, returns the 8th element
        y.length = 2; // fine, modifies x through y
        delete x; // fine, clears the array, also modifies y
        // The following does not work; it would need to create a new temporary /
        // unnamed array in storage, but storage is "statically" allocated:
        // y = memoryArray;
        // This does not work either, since it would "reset" the pointer, but there
        // is no sensible location it could point to.
        // delete y;
        g(x); // calls g, handing over a reference to x
        h(x); // calls h and creates an independent, temporary copy in memory
    }

    function g(uint[] storage storageArray) internal {}
    function h(uint[] memoryArray) {}
}
Summary¶
Forced data location:
- parameters (not return) of external functions: calldata
- state variables: storage
Default data location:
- parameters (also return) of functions: memory
- all other local variables: storage
Arrays¶
Arrays can have a compile-time fixed size or they can be dynamic. For storage arrays, the element type can be arbitrary (i.e. also other arrays, mappings or structs). For memory arrays, it cannot be a mapping and has to be an ABI type if it is an argument of a publicly-visible function.
An array of fixed size k and element type T is written as T[k], an array of dynamic size as T[]. As an example, an array of 5 dynamic arrays of uint is uint[][5] (note that the notation is reversed when compared to some other languages). To access the second uint in the third dynamic array, you use x[2][1] (indices are zero-based and access works in the opposite way of the declaration, i.e. x[2] shaves off one level in the type from the right).
Variables of type bytes and string are special arrays. A bytes is similar to byte[], but it is packed tightly in calldata. string is equal to bytes but does not allow length or index access (for now).
So bytes should always be preferred over byte[] because it is cheaper.
Note
If you want to access the byte-representation of a string s, use bytes(s).length / bytes(s)[7] = 'x';. Keep in mind that you are accessing the low-level bytes of the utf-8 representation, and not the individual characters!
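As a small sketch of the conversion described in this note, the following hypothetical contract StringLengthExample returns the number of bytes in a string.

```solidity
contract StringLengthExample {
    function byteLength(string s) returns (uint) {
        // Counts bytes of the utf-8 encoding, not characters: a string
        // containing non-ASCII characters reports more bytes than it
        // has characters.
        return bytes(s).length;
    }
}
```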
Members¶
- length:
- Arrays have a length member to hold their number of elements. Dynamic arrays can be resized in storage (not in memory) by changing the .length member. This does not happen automatically when attempting to access elements outside the current length. The size of memory arrays is fixed (but dynamic, i.e. it can depend on runtime parameters) once they are created.
- push:
- Dynamic storage arrays and bytes (not string) have a member function called push that can be used to append an element at the end of the array. The function returns the new length.
Warning
It is not yet possible to use arrays of arrays in external functions.
Warning
Due to limitations of the EVM, it is not possible to return dynamic content from external function calls. The function f in contract C { function f() returns (uint[]) { ... } } will return something if called from web3.js, but not if called from Solidity.
The only workaround for now is to use large statically-sized arrays.
contract ArrayContract {
    uint[2**20] m_aLotOfIntegers;
    // Note that the following is not a pair of arrays but an array of pairs.
    bool[2][] m_pairsOfFlags;

    // newPairs is stored in memory - the default for function arguments
    function setAllFlagPairs(bool[2][] newPairs) {
        // assignment to a storage array replaces the complete array
        m_pairsOfFlags = newPairs;
    }

    function setFlagPair(uint index, bool flagA, bool flagB) {
        // access to a non-existing index will throw an exception
        m_pairsOfFlags[index][0] = flagA;
        m_pairsOfFlags[index][1] = flagB;
    }

    function changeFlagArraySize(uint newSize) {
        // if the new size is smaller, removed array elements will be cleared
        m_pairsOfFlags.length = newSize;
    }

    function clear() {
        // these clear the arrays completely
        delete m_pairsOfFlags;
        delete m_aLotOfIntegers;
        // identical effect here
        m_pairsOfFlags.length = 0;
    }

    bytes m_byteData;

    function byteArrays(bytes data) {
        // byte arrays ("bytes") are different as they are stored without padding,
        // but can be treated identical to "uint8[]"
        m_byteData = data;
        m_byteData.length += 7;
        m_byteData[3] = 8;
        delete m_byteData[2];
    }

    function addFlag(bool[2] flag) returns (uint) {
        return m_pairsOfFlags.push(flag);
    }

    function createMemoryArray(uint size) returns (bytes) {
        // Dynamic memory arrays are created using `new`:
        uint[2][] memory arrayOfPairs = new uint[2][](size);
        // Create a dynamic byte array:
        bytes memory b = new bytes(200);
        for (uint i = 0; i < b.length; i++)
            b[i] = byte(i);
        return b;
    }
}
Structs¶
Solidity provides a way to define new types in the form of structs, which is shown in the following example:
contract CrowdFunding {
    // Defines a new type with two fields.
    struct Funder {
        address addr;
        uint amount;
    }

    struct Campaign {
        address beneficiary;
        uint fundingGoal;
        uint numFunders;
        uint amount;
        mapping (uint => Funder) funders;
    }

    uint numCampaigns;
    mapping (uint => Campaign) campaigns;

    function newCampaign(address beneficiary, uint goal) returns (uint campaignID) {
        campaignID = numCampaigns++; // campaignID is return variable
        // Creates new struct and saves in storage. We leave out the mapping type.
        campaigns[campaignID] = Campaign(beneficiary, goal, 0, 0);
    }

    function contribute(uint campaignID) {
        Campaign c = campaigns[campaignID];
        // Creates a new temporary memory struct, initialised with the given values
        // and copies it over to storage.
        // Note that you can also use Funder(msg.sender, msg.value) to initialise.
        c.funders[c.numFunders++] = Funder({addr: msg.sender, amount: msg.value});
        c.amount += msg.value;
    }

    function checkGoalReached(uint campaignID) returns (bool reached) {
        Campaign c = campaigns[campaignID];
        if (c.amount < c.fundingGoal)
            return false;
        c.beneficiary.send(c.amount);
        c.amount = 0;
        return true;
    }
}
The contract does not provide the full functionality of a crowdfunding contract, but it contains the basic concepts necessary to understand structs. Struct types can be used inside mappings and arrays and they can themselves contain mappings and arrays.
It is not possible for a struct to contain a member of its own type, although the struct itself can be the value type of a mapping member. This restriction is necessary, as the size of the struct has to be finite.
Note how in all the functions, a struct type is assigned to a local variable (of the default storage data location). This does not copy the struct but only stores a reference so that assignments to members of the local variable actually write to the state.
Of course, you can also directly access the members of the struct without assigning it to a local variable, as in campaigns[campaignID].amount = 0.
Mappings¶
Mapping types are declared as mapping (_KeyType => _ValueType), where _KeyType can be almost any type except for a mapping and _ValueType can actually be any type, including mappings.
Mappings can be seen as hashtables which are virtually initialized such that every possible key exists and is mapped to a value whose byte-representation is all zeros. The similarity ends here, though: The key data is not actually stored in a mapping, only its sha3 hash is used to look up the value.
Because of this, mappings do not have a length or a concept of a key or value being “set”.
Mappings are only allowed for state variables (or as storage reference types in internal functions).
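Because a mapping cannot tell whether a key was ever written, contracts that need this information typically record it themselves. The following is a sketch of that idea, not a prescribed pattern; MappingExample and all of its members are hypothetical names.

```solidity
contract MappingExample {
    mapping (address => uint) balances;
    // Separate flag, since a balance of 0 does not reveal whether
    // the key was ever written.
    mapping (address => bool) known;

    function deposit(uint amount) {
        balances[msg.sender] += amount;
        known[msg.sender] = true;
    }

    function balanceOf(address who) returns (uint) {
        // Keys that were never written read as zero.
        return balances[who];
    }

    function isKnown(address who) returns (bool) {
        return known[who];
    }
}
```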
Operators Involving LValues¶
If a is an LValue (i.e. a variable or something that can be assigned to), the following operators are available as shorthands:
a += e is equivalent to a = a + e. The operators -=, *=, /=, %=, |=, &= and ^= are defined accordingly. a++ and a-- are equivalent to a += 1 / a -= 1 but the expression itself still has the previous value of a. In contrast, --a and ++a have the same effect on a but return the value after the change.
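The difference between the postfix and prefix forms can be seen in a short sketch. The contract IncrementExample and its function and variable names are made up for illustration.

```solidity
contract IncrementExample {
    function postfix() returns (uint oldValue, uint newValue) {
        uint a = 5;
        oldValue = a++; // the expression has the previous value: 5
        newValue = a;   // a itself is now 6
    }

    function prefix() returns (uint result) {
        uint a = 5;
        result = ++a; // the expression has the value after the change: 6
    }
}
```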
delete¶
delete a assigns the initial value for the type to a. I.e. for integers it is equivalent to a = 0, but it can also be used on arrays, where it assigns a dynamic array of length zero or a static array of the same length with all elements reset. For structs, it assigns a struct with all members reset.
delete has no effect on whole mappings (as the keys of mappings may be arbitrary and are generally unknown). So if you delete a struct, it will reset all members that are not mappings and also recurse into the members unless they are mappings. However, individual keys and what they map to can be deleted.
It is important to note that delete a really behaves like an assignment to a, i.e. it stores a new object in a.
contract DeleteExample {
    uint data;
    uint[] dataArray;

    function f() {
        uint x = data;
        delete x; // sets x to 0, does not affect data
        delete data; // sets data to 0, does not affect x which still holds a copy
        uint[] y = dataArray;
        delete dataArray; // this sets dataArray.length to zero, but as uint[] is a complex object, also
        // y is affected which is an alias to the storage object
        // On the other hand: "delete y" is not valid, as assignments to local variables
        // referencing storage objects can only be made from existing storage objects.
    }
}
Conversions between Elementary Types¶
Implicit Conversions¶
If an operator is applied to different types, the compiler tries to implicitly convert one of the operands to the type of the other (the same is true for assignments). In general, an implicit conversion between value-types is possible if it makes sense semantically and no information is lost: uint8 is convertible to uint16 and int128 to int256, but int8 is not convertible to uint256 (because uint256 cannot hold e.g. -1). Furthermore, unsigned integers can be converted to bytes of the same or larger size, but not vice-versa. Any type that can be converted to uint160 can also be converted to address.
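A brief sketch of an implicit conversion that is accepted and one that is rejected, using the hypothetical contract ConversionExample:

```solidity
contract ConversionExample {
    function widen(uint8 small) returns (uint16) {
        // Accepted: every uint8 value fits into uint16.
        return small;
    }

    function toUnsigned(int8 y) returns (uint256) {
        // "return y;" would be rejected here, because uint256 cannot
        // hold negative values. An explicit conversion is required.
        return uint256(y);
    }
}
```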
Explicit Conversions¶
If the compiler does not allow implicit conversion but you know what you are doing, an explicit type conversion is sometimes possible:
int8 y = -3;
uint x = uint(y);
At the end of this code snippet, x will have the value 0xfffff..fd (64 hex characters), which is -3 in two’s complement representation of 256 bits.
If a type is explicitly converted to a smaller type, higher-order bits are cut off:
uint32 a = 0x12345678;
uint16 b = uint16(a); // b will be 0x5678 now
Type Deduction¶
For convenience, it is not always necessary to explicitly specify the type of a variable; the compiler automatically infers it from the type of the first expression that is assigned to the variable:
uint24 x = 0x123;
var y = x;
Here, the type of y will be uint24. Using var is not possible for function parameters or return parameters.
Warning
The type is only deduced from the first assignment, so the loop in the following snippet is infinite, as i will have the type uint8 and any value of this type is smaller than 2000.
for (var i = 0; i < 2000; i++) { ... }
Units and Globally Available Variables¶
Ether Units¶
A literal number can take a suffix of wei, finney, szabo or ether to convert between the subdenominations of Ether, where Ether currency numbers without a postfix are assumed to be “wei”, e.g. 2 ether == 2000 finney evaluates to true.
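Unit suffixes are mostly useful for writing amounts readably while the contract still works in wei internally. A minimal sketch, with the hypothetical contract PriceExample and its members chosen for illustration:

```solidity
contract PriceExample {
    uint price = 2 ether; // stored in wei, like every Ether amount

    function paidEnough() returns (bool) {
        // msg.value is given in wei, so it compares directly to price.
        return msg.value >= price;
    }

    function priceInFinney() returns (uint) {
        // Dividing by one unit converts wei into that unit: 2000.
        return price / 1 finney;
    }
}
```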
Time Units¶
Suffixes of seconds, minutes, hours, days, weeks and years after literal numbers can be used to convert between units of time where seconds are the base unit and units are considered naively in the following way:
- 1 == 1 seconds
- 1 minutes == 60 seconds
- 1 hours == 60 minutes
- 1 days == 24 hours
- 1 weeks == 7 days
- 1 years == 365 days
Take care if you perform calendar calculations using these units, because not every year equals 365 days and not even every day has 24 hours because of leap seconds. Due to the fact that leap seconds cannot be predicted, an exact calendar library has to be updated by an external oracle.
These suffixes cannot be applied to variables. If you want to interpret some input variable in e.g. days, you can do it in the following way:
function f(uint start, uint daysAfter) {
    if (now >= start + daysAfter * 1 days) { ... }
}
Special Variables and Functions¶
There are special variables and functions which always exist in the global namespace and are mainly used to provide information about the blockchain.
Block and Transaction Properties¶
- block.coinbase (address): current block miner’s address
- block.difficulty (uint): current block difficulty
- block.gaslimit (uint): current block gaslimit
- block.number (uint): current block number
- block.blockhash (function(uint) returns (bytes32)): hash of the given block - only for 256 most recent blocks
- block.timestamp (uint): current block timestamp
- msg.data (bytes): complete calldata
- msg.gas (uint): remaining gas
- msg.sender (address): sender of the message (current call)
- msg.sig (bytes4): first four bytes of the calldata (i.e. function identifier)
- msg.value (uint): number of wei sent with the message
- now (uint): current block timestamp (alias for block.timestamp)
- tx.gasprice (uint): gas price of the transaction
- tx.origin (address): sender of the transaction (full call chain)
Note
The values of all members of msg, including msg.sender and msg.value can change for every external function call. This includes calls to library functions.
If you want to implement access restrictions in library functions using msg.sender, you have to manually supply the value of msg.sender as an argument.
Note
The block hashes are not available for all blocks for scalability reasons. You can only access the hashes of the most recent 256 blocks, all other values will be zero.
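Several of the properties listed above are commonly combined to record who created a contract and when. The sketch below assumes a hypothetical contract Creation; its state variables and functions are illustrative names only.

```solidity
contract Creation {
    address owner;
    uint createdAt;      // timestamp of the creating block
    uint createdInBlock; // number of the creating block

    // Constructor: runs once when the contract is created.
    function Creation() {
        owner = msg.sender;
        createdAt = now;
        createdInBlock = block.number;
    }

    function isOwner() returns (bool) {
        // msg.sender is the immediate caller of this function.
        return msg.sender == owner;
    }

    function ageInSeconds() returns (uint) {
        return now - createdAt;
    }
}
```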
Mathematical and Cryptographic Functions¶
- addmod(uint x, uint y, uint k) returns (uint):
- compute (x + y) % k where the addition is performed with arbitrary precision and does not wrap around at 2**256.
- mulmod(uint x, uint y, uint k) returns (uint):
- compute (x * y) % k where the multiplication is performed with arbitrary precision and does not wrap around at 2**256.
- sha3(...) returns (bytes32):
- compute the Ethereum-SHA-3 hash of the (tightly packed) arguments
- sha256(...) returns (bytes32):
- compute the SHA-256 hash of the (tightly packed) arguments
- ripemd160(...) returns (bytes20):
- compute RIPEMD-160 hash of the (tightly packed) arguments
- ecrecover(bytes32, uint8, bytes32, bytes32) returns (address):
- recover the address associated with the public key from an elliptic curve signature - arguments are (data, v, r, s)
In the above, “tightly packed” means that the arguments are concatenated without padding. This means that the following are all identical:
sha3("ab", "c")
sha3("abc")
sha3(0x616263)
sha3(6382179)
sha3(97, 98, 99)
If padding is needed, explicit type conversions can be used: sha3("\x00\x12") is the same as sha3(uint16(0x12)).
It might be that you run into Out-of-Gas for sha256, ripemd160 or ecrecover on a private blockchain. The reason for this is that those are implemented as so-called precompiled contracts and these contracts only really exist after they received the first message (although their contract code is hardcoded). Messages to non-existing contracts are more expensive and thus the execution runs into an Out-of-Gas error. A workaround for this problem is to first send e.g. 1 Wei to each of the contracts before you use them in your actual contracts. This is not an issue on the official or test net.
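Returning to addmod from the list above, the difference to plain arithmetic only shows when the intermediate sum exceeds 256 bits. The following sketch contrasts the two; the contract ModExample and its function names are hypothetical.

```solidity
contract ModExample {
    function wrappingSum(uint x, uint y, uint k) returns (uint) {
        // x + y is first reduced modulo 2**256, then modulo k.
        return (x + y) % k;
    }

    function exactSum(uint x, uint y, uint k) returns (uint) {
        // The addition is performed with arbitrary precision.
        return addmod(x, y, k);
    }
}
```

For x = 2**256 - 1, y = 1 and k = 10, wrappingSum yields 0 because the sum wraps around to 0, whereas exactSum yields 6, the remainder of 2**256 divided by 10.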
Contract Related¶
- this (current contract’s type):
- the current contract, explicitly convertible to address
- selfdestruct(address):
- destroy the current contract, sending its funds to the given address
Furthermore, all functions of the current contract are callable directly including the current function.
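A common use of selfdestruct is to let the creator of a contract remove it again. This is a minimal sketch under that assumption; the contract Mortal and its members are names chosen for the example.

```solidity
contract Mortal {
    address owner;

    // Constructor: remembers who created the contract.
    function Mortal() {
        owner = msg.sender;
    }

    function kill() {
        // Only the creator may destroy the contract.
        if (msg.sender != owner)
            throw;
        // Destroys the contract and sends its funds to owner.
        selfdestruct(owner);
    }
}
```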
Expressions and Control Structures¶
Control Structures¶
Most of the control structures from C/JavaScript are available in Solidity except for switch and goto. So there is: if, else, while, for, break, continue, return, ? :, with the usual semantics known from C / JavaScript.
Parentheses cannot be omitted for conditionals, but curly braces can be omitted around single-statement bodies.
Note that there is no type conversion from non-boolean to boolean types as there is in C and JavaScript, so if (1) { ... } is not valid Solidity.
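The rules above can be illustrated with a short sketch: loop and branch bodies without braces, and an explicit comparison where C or JavaScript would accept a bare integer. The contract ControlExample and its functions are hypothetical.

```solidity
contract ControlExample {
    function sumTo(uint n) returns (uint total) {
        // Braces are omitted around the single-statement body.
        for (uint i = 1; i <= n; i++)
            total += i;
    }

    function isNonZero(uint x) returns (bool) {
        // "if (x)" would be rejected; the comparison has to be explicit.
        if (x != 0)
            return true;
        return false;
    }
}
```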
Function Calls¶
Internal Function Calls¶
Functions of the current contract can be called directly (“internally”), also recursively, as seen in this nonsensical example:
contract c {
    function g(uint a) returns (uint ret) { return f(); }
    function f() returns (uint ret) { return g(7) + f(); }
}
These function calls are translated into simple jumps inside the EVM. This has the effect that the current memory is not cleared, i.e. passing memory references to internally-called functions is very efficient. Only functions of the same contract can be called internally.
External Function Calls¶
The expression this.g(8); is also a valid function call, but this time, the function will be called “externally”, via a message call and not directly via jumps. Functions of other contracts have to be called externally. For an external call, all function arguments have to be copied to memory.
When calling functions of other contracts, the amount of Wei sent with the call and the gas can be specified:
contract InfoFeed {
    function info() returns (uint ret) { return 42; }
}

contract Consumer {
    InfoFeed feed;
    function setFeed(address addr) { feed = InfoFeed(addr); }
    function callFeed() { feed.info.value(10).gas(800)(); }
}
Note that the expression InfoFeed(addr) performs an explicit type conversion stating that “we know that the type of the contract at the given address is InfoFeed” and this does not execute a constructor. We could also have used function setFeed(InfoFeed _feed) { feed = _feed; } directly. Be careful about the fact that feed.info.value(10).gas(800) only (locally) sets the value and amount of gas sent with the function call and only the parentheses at the end perform the actual call.
Named Calls and Anonymous Function Parameters¶
Function call arguments can also be given by name, in any order, and the names of unused parameters (especially return parameters) can be omitted.
contract c {
    function f(uint key, uint value) { ... }

    function g() {
        // named arguments
        f({value: 2, key: 3});
    }

    // omitted parameters
    function func(uint k, uint) returns(uint) {
        return k;
    }
}
Order of Evaluation of Expressions¶
The evaluation order of expressions is not specified (more formally, the order in which the children of one node in the expression tree are evaluated is not specified, but they are of course evaluated before the node itself). It is only guaranteed that statements are executed in order and short-circuiting for boolean expressions is done.
Assignment¶
Destructuring Assignments and Returning Multiple Values¶
Solidity internally allows tuple types, i.e. a list of objects of potentially different types whose size is a constant at compile-time. Those tuples can be used to return multiple values at the same time and also assign them to multiple variables (or LValues in general) at the same time:
contract C {
    uint[] data;

    function f() returns (uint, bool, uint) {
        return (7, true, 2);
    }

    function g() {
        // Declares and assigns the variables. Specifying the type explicitly is not possible.
        var (x, b, y) = f();
        // Assigns to a pre-existing variable.
        (x, y) = (2, 7);
        // Common trick to swap values -- does not work for non-value storage types.
        (x, y) = (y, x);
        // Components can be left out (also for variable declarations).
        // If the tuple ends in an empty component,
        // the rest of the values are discarded.
        (data.length,) = f(); // Sets the length to 7
        // The same can be done on the left side.
        (,data[3]) = f(); // Sets data[3] to 2
        // Components can only be left out at the left-hand-side of assignments, with
        // one exception:
        (x,) = (1,);
        // (1,) is the only way to specify a 1-component tuple, because (1) is
        // equivalent to 1.
    }
}
Complications for Arrays and Structs¶
The semantics of assignment are a bit more complicated for non-value types like arrays and structs. Assigning to a state variable always creates an independent copy. On the other hand, assigning to a local variable creates an independent copy only for elementary types, i.e. static types that fit into 32 bytes. If structs or arrays (including bytes and string) are assigned from a state variable to a local variable, the local variable holds a reference to the original state variable. A second assignment to the local variable does not modify the state but only changes the reference. Assignments to members (or elements) of the local variable do change the state.
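The rules above can be illustrated with a short sketch. The contract AssignmentDemo and its variable names are hypothetical, and the comments restate what the paragraph above says should happen at each step.

```solidity
contract AssignmentDemo {
    uint[] data;   // state variable
    uint[] other;  // second state variable

    function demo() {
        data.length = 2;
        other.length = 2;

        // The local variable holds a reference to the state variable `data`.
        uint[] storage localRef = data;
        // Assigning to an element changes the state: data[0] is now 1.
        localRef[0] = 1;
        // A second assignment only changes the reference;
        // `data` itself is not modified by this line.
        localRef = other;
        // This writes to other[0], not to data[0].
        localRef[0] = 2;

        // Assigning to a state variable creates an independent copy:
        // `other` receives a copy of the contents of `data`.
        other = data;

        // Elementary types are copied when assigned to a local variable.
        uint y = data[0];
        y = 5; // data[0] is still 1
    }
}
```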
Exceptions¶
There are some cases where exceptions are thrown automatically (see below). You can use the throw instruction to throw an exception manually. The effect of an exception is that the currently executing call is stopped and reverted (i.e. all changes to the state and balances are undone) and the exception is also “bubbled up” through Solidity function calls. The only cases where an exception does not bubble up are send and the low-level functions call, delegatecall and callcode: those return false in case of an exception.
Catching exceptions is not yet possible.
In the following example, we show how throw can be used to easily revert an Ether transfer and also how to check the return value of send:
contract Sharer {
    function sendHalf(address addr) returns (uint balance) {
        if (!addr.send(msg.value/2))
            throw; // also reverts the transfer to Sharer
        return this.balance;
    }
}
Currently, there are three situations in which exceptions happen automatically in Solidity:
- If you access an array beyond its length (i.e. x[i] where i >= x.length)
- If a function called via a message call does not finish properly (i.e. it runs out of gas or throws an exception itself).
- If a non-existent function on a library is called or Ether is sent to a library.
Internally, Solidity performs an “invalid jump” when an exception is thrown and thus causes the EVM to revert all changes made to the state. The reason for this is that there is no safe way to continue execution, because an expected effect did not occur. Because we want to retain the atomicity of transactions, the safest thing to do is to revert all changes and make the whole transaction (or at least call) without effect.
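The difference between a bubbling exception and a false return value can be sketched as follows. The contracts Thrower and Observer are hypothetical; the sketch assumes that the first four bytes of sha3("fail()") select the function fail, as for any function identifier.

```solidity
contract Thrower {
    function fail() {
        throw;
    }
}

contract Observer {
    // The exception thrown in Thrower.fail() bubbles up:
    // this call is stopped and reverted as well.
    function viaFunctionCall(Thrower t) {
        t.fail();
    }

    // The low-level function call does not bubble up the exception,
    // it returns false instead, so the result has to be checked.
    function viaLowLevelCall(address t) returns (bool ok) {
        ok = t.call(bytes4(sha3("fail()")));
    }
}
```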
Contracts¶
Contracts in Solidity are what classes are in object oriented languages. They contain persistent data in state variables and functions that can modify these variables. Calling a function on a different contract (instance) will perform an EVM function call and thus switch the context such that state variables are inaccessible.
Creating Contracts¶
Contracts can be created “from outside” or from Solidity contracts. When a contract is created, its constructor (a function with the same name as the contract) is executed once.
From web3.js, i.e. the JavaScript API, this is done as follows:
// The json abi array generated by the compiler
var abiArray = [
    {
        "inputs":[
            {"name":"x","type":"uint256"},
            {"name":"y","type":"uint256"}
        ],
        "type":"constructor"
    },
    {
        "constant":true,
        "inputs":[],
        "name":"x",
        "outputs":[{"name":"","type":"bytes32"}],
        "type":"function"
    }
];

var MyContract = web3.eth.contract(abiArray);
// deploy new contract; the constructor takes two arguments (x and y)
var contractInstance = MyContract.new(
    10,
    11,
    {from: myAccount, gas: 1000000}
);
Internally, constructor arguments are passed after the code of the contract itself, but you do not have to care about this if you use web3.js.
If a contract wants to create another contract, the source code (and the binary) of the created contract has to be known to the creator. This means that cyclic creation dependencies are impossible.
contract OwnedToken {
    // TokenCreator is a contract type that is defined below.
    // It is fine to reference it as long as it is not used
    // to create a new contract.
    TokenCreator creator;
    address owner;
    bytes32 name;

    // This is the constructor which registers the
    // creator and the assigned name.
    function OwnedToken(bytes32 _name) {
        owner = msg.sender;
        // We do an explicit type conversion from `address`
        // to `TokenCreator` and assume that the type of
        // the calling contract is TokenCreator, there is
        // no real way to check that.
        creator = TokenCreator(msg.sender);
        name = _name;
    }

    function changeName(bytes32 newName) {
        // Only the creator can alter the name --
        // the comparison is possible since contracts
        // are implicitly convertible to addresses.
        if (msg.sender == creator)
            name = newName;
    }

    function transfer(address newOwner) {
        // Only the current owner can transfer the token.
        if (msg.sender != owner) return;
        // We also want to ask the creator if the transfer
        // is fine. Note that this calls a function of the
        // contract defined below. If the call fails (e.g.
        // due to out-of-gas), the execution here stops
        // immediately.
        if (creator.isTokenTransferOK(owner, newOwner))
            owner = newOwner;
    }
}

contract TokenCreator {
    function createToken(bytes32 name)
        returns (OwnedToken tokenAddress)
    {
        // Create a new Token contract and return its address.
        // From the JavaScript side, the return type is simply
        // "address", as this is the closest type available in
        // the ABI.
        return new OwnedToken(name);
    }

    function changeName(OwnedToken tokenAddress, bytes32 name) {
        // Again, the external type of "tokenAddress" is
        // simply "address".
        tokenAddress.changeName(name);
    }

    function isTokenTransferOK(
        address currentOwner,
        address newOwner
    ) returns (bool ok) {
        // Check some arbitrary condition.
        address tokenAddress = msg.sender;
        return (sha3(newOwner) & 0xff) == (bytes20(tokenAddress) & 0xff);
    }
}
Visibility and Accessors¶
Since Solidity knows two kinds of function calls (internal ones that do not create an actual EVM call (also called a “message call”) and external ones that do), there are four types of visibilities for functions and state variables.
Functions can be specified as being external, public, internal or private, where the default is public. For state variables, external is not possible and the default is internal.
- external:
- External functions are part of the contract interface, which means they can be called from other contracts and via transactions. An external function f cannot be called internally (i.e. f() does not work, but this.f() works). External functions are sometimes more efficient when they receive large arrays of data.
- public:
- Public functions are part of the contract interface and can be either called internally or via messages. For public state variables, an automatic accessor function (see below) is generated.
- internal:
- Those functions and state variables can only be accessed internally (i.e. from within the current contract or contracts deriving from it), without using this.
- private:
- Private functions and state variables are only visible for the contract they are defined in and not in derived contracts.
Note
Everything that is inside a contract is visible to all external observers. Making something private only prevents other contracts from accessing and modifying the information, but it will still be visible to the whole world outside of the blockchain.
The visibility specifier is given after the type for state variables and between parameter list and return parameter list for functions.
contract c {
    function f(uint a) private returns (uint b) { return a + 1; }
    function setData(uint a) internal { data = a; }
    uint public data;
}
Other contracts can call c.data() to retrieve the value of data in state storage, but are not able to call f. Contracts derived from c can call setData to alter the value of data (but only in their own state).
Accessor Functions¶
The compiler automatically creates accessor functions for all public state variables. The contract given below will have a function called data that does not take any arguments and returns a uint, the value of the state variable data. The initialization of state variables can be done at declaration.
The accessor functions have external visibility. If the symbol is accessed internally (i.e. without this.), it is a state variable and if it is accessed externally (i.e. with this.), it is a function.
contract test { uint public data = 42; }
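The two ways of accessing the symbol can be put side by side in a small sketch. The contract AccessorDemo and the function readBoth are hypothetical names used only for illustration.

```solidity
contract AccessorDemo {
    uint public data = 42;

    function readBoth() returns (uint a, uint b) {
        // Accessed internally: `data` is a state variable.
        a = data;
        // Accessed externally: `this.data()` is the accessor
        // function and is invoked via a message call.
        b = this.data();
    }
}
```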
The next example is a bit more complex:
contract complex {
    struct Data {
        uint a;
        bytes3 b;
        mapping(uint => uint) map;
    }
    mapping(uint => mapping(bool => Data[])) public data;
}
It will generate a function of the following form:
function data(uint arg1, bool arg2, uint arg3) returns (uint a, bytes3 b) {
    a = data[arg1][arg2][arg3].a;
    b = data[arg1][arg2][arg3].b;
}
Note that the mapping in the struct is omitted because there is no good way to provide the key for the mapping.
Function Modifiers¶
Modifiers can be used to easily change the behaviour of functions, for example to automatically check a condition prior to executing the function. They are inheritable properties of contracts and may be overridden by derived contracts.
contract owned {
    function owned() { owner = msg.sender; }
    address owner;

    // This contract only defines a modifier but does not use
    // it - it will be used in derived contracts.
    // The function body is inserted where the special symbol
    // "_" in the definition of a modifier appears.
    // This means that if the owner calls this function, the
    // function is executed and otherwise, an exception is
    // thrown.
    modifier onlyowner {
        if (msg.sender != owner)
            throw;
        _
    }
}

contract mortal is owned {
    // This contract inherits the "onlyowner"-modifier from
    // "owned" and applies it to the "close"-function, which
    // causes that calls to "close" only have an effect if
    // they are made by the stored owner.
    function close() onlyowner {
        selfdestruct(owner);
    }
}

contract priced {
    // Modifiers can receive arguments:
    modifier costs(uint price) {
        if (msg.value >= price)
            _
    }
}

contract Register is priced, owned {
    mapping (address => bool) registeredAddresses;
    uint price;

    function Register(uint initialPrice) { price = initialPrice; }

    function register() costs(price) {
        registeredAddresses[msg.sender] = true;
    }

    function changePrice(uint _price) onlyowner {
        price = _price;
    }
}
Multiple modifiers can be applied to a function by specifying them in a whitespace-separated list and will be evaluated in order. Explicit returns from a modifier or function body immediately leave the whole function, while control flow reaching the end of a function or modifier body continues after the “_” in the preceding modifier. Arbitrary expressions are allowed for modifier arguments and in this context, all symbols visible from the function are visible in the modifier. Symbols introduced in the modifier are not visible in the function (as they might change by overriding).
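The evaluation order can be sketched with two modifiers on one function. The contract Guarded and its member names are hypothetical, and the sketch uses the same "_" notation as the example above.

```solidity
contract Guarded {
    address owner;
    bool locked;
    uint public counter;

    function Guarded() { owner = msg.sender; }

    modifier onlyowner {
        if (msg.sender != owner)
            throw;
        _
    }

    modifier nolock {
        if (locked)
            throw;
        locked = true;
        _
        // Reached when control flow arrives at the end of the
        // function body (or of a later modifier).
        locked = false;
    }

    // onlyowner is evaluated first, then nolock, then the body.
    function bump() onlyowner nolock {
        counter = counter + 1;
    }
}
```

When the body of bump ends, execution continues after the "_" in nolock and then after the "_" in onlyowner. An explicit return inside bump would leave the whole function, so the statement after "_" in nolock would not run.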
Constants¶
State variables can be declared as constant (this is not yet implemented for array and struct types and not possible for mapping types).
contract C { uint constant x = 32**22 + 8; string constant text = "abc"; }
This has the effect that the compiler does not reserve a storage slot for these variables and every occurrence is replaced by their constant value.
The value expression can only contain integer arithmetics.
Fallback Function¶
A contract can have exactly one unnamed function. This function cannot have arguments and is executed on a call to the contract if none of the other functions matches the given function identifier (or if no data was supplied at all).
Furthermore, this function is executed whenever the contract receives plain Ether (without data). In such a context, there is very little gas available to the function call, so it is important to make fallback functions as cheap as possible.
contract Test {
    function() { x = 1; }
    uint x;
}

// This contract rejects any Ether sent to it. It is good
// practice to include such a function for every contract
// in order not to lose Ether.
contract Rejector {
    function() { throw; }
}

contract Caller {
    function callTest(address testAddress) {
        Test(testAddress).call(0xabcdef01); // hash does not exist
        // results in Test(testAddress).x becoming == 1.
        Rejector r = Rejector(0x123);
        r.send(2 ether);
        // results in r.balance == 0
    }
}
Events¶
Events allow the convenient usage of the EVM logging facilities, which in turn can be used to “call” JavaScript callbacks in the user interface of a dapp, which listen for these events.
Events are inheritable members of contracts. When they are called, they cause the arguments to be stored in the transaction’s log - a special data structure in the blockchain. These logs are associated with the address of the contract and will be incorporated into the blockchain and stay there as long as a block is accessible (forever as of Frontier and Homestead, but this might change with Serenity). Log and event data is not accessible from within contracts (not even from the contract that created a log).
SPV proofs for logs are possible, so if an external entity supplies a contract with such a proof, it can check that the log actually exists inside the blockchain (but be aware of the fact that ultimately, also the block headers have to be supplied because the contract can only see the last 256 block hashes).
Up to three parameters can receive the attribute indexed, which makes the respective arguments searchable: it is possible to filter for specific values of indexed arguments in the user interface.
If arrays (including string and bytes) are used as indexed arguments, their sha3 hash is stored as the topic instead.
The hash of the signature of the event is one of the topics, unless you declared the event with the anonymous specifier. This means that it is not possible to filter for specific anonymous events by name.
All non-indexed arguments will be stored in the data part of the log.
contract ClientReceipt {
    event Deposit(
        address indexed _from,
        bytes32 indexed _id,
        uint _value
    );

    function deposit(bytes32 _id) {
        // Any call to this function (even deeply nested) can
        // be detected from the JavaScript API by filtering
        // for `Deposit` to be called.
        Deposit(msg.sender, _id, msg.value);
    }
}
The use in the JavaScript API would be as follows:
var abi = /* abi as generated by the compiler */;
var ClientReceipt = web3.eth.contract(abi);
var clientReceipt = ClientReceipt.at(0x123 /* address */);

var event = clientReceipt.Deposit();

// watch for changes
event.watch(function(error, result){
    // result will contain various information
    // including the arguments given to the Deposit
    // call.
    if (!error)
        console.log(result);
});

// Or pass a callback to start watching immediately
var event = clientReceipt.Deposit(function(error, result) {
    if (!error)
        console.log(result);
});
Low-Level Interface to Logs¶
It is also possible to access the low-level interface to the logging mechanism via the functions log0, log1, log2, log3 and log4. logi takes i + 1 parameters of type bytes32, where the first argument will be used for the data part of the log and the others as topics. The event call above can be performed in the same way as
log3(
    msg.value,
    0x50cb9fe53daa9737b786ab3646f04d0150dc50ef4e75f59509d83667ad5adb20,
    msg.sender,
    _id
);
where the long hexadecimal number is equal to sha3("Deposit(address,hash256,uint256)"), the signature of the event.
Additional Resources for Understanding Events¶
Inheritance¶
Solidity supports multiple inheritance, including polymorphism, by copying code.
All function calls are virtual, which means that the most derived function is called, except when the contract is explicitly given.
Even if a contract inherits from multiple other contracts, only a single contract is created on the blockchain: the code from the base contracts is always copied into the final contract.
The general inheritance system is very similar to Python’s, especially concerning multiple inheritance.
Details are given in the following example.
contract owned {
    function owned() { owner = msg.sender; }
    address owner;
}

// Use "is" to derive from another contract. Derived
// contracts can access all non-private members including
// internal functions and state variables. These cannot be
// accessed externally via `this`, though.
contract mortal is owned {
    function kill() {
        if (msg.sender == owner) selfdestruct(owner);
    }
}

// These abstract contracts are only provided to make the
// interface known to the compiler. Note the function
// without body. If a contract does not implement all
// functions it can only be used as an interface.
contract Config {
    function lookup(uint id) returns (address adr);
}

contract NameReg {
    function register(bytes32 name);
    function unregister();
}

// Multiple inheritance is possible. Note that "owned" is
// also a base class of "mortal", yet there is only a single
// instance of "owned" (as for virtual inheritance in C++).
contract named is owned, mortal {
    function named(bytes32 name) {
        Config config = Config(0xd5f9d8d94886e70b06e474c3fb14fd43e2f23970);
        NameReg(config.lookup(1)).register(name);
    }

    // Functions can be overridden, both local and
    // message-based function calls take these overrides
    // into account.
    function kill() {
        if (msg.sender == owner) {
            Config config = Config(0xd5f9d8d94886e70b06e474c3fb14fd43e2f23970);
            NameReg(config.lookup(1)).unregister();
            // It is still possible to call a specific
            // overridden function.
            mortal.kill();
        }
    }
}

// If a constructor takes an argument, it needs to be
// provided in the header (or modifier-invocation-style at
// the constructor of the derived contract (see below)).
contract PriceFeed is owned, mortal, named("GoldFeed") {
    function updateInfo(uint newInfo) {
        if (msg.sender == owner) info = newInfo;
    }

    function get() constant returns(uint r) { return info; }

    uint info;
}
Note that above, we call mortal.kill() to “forward” the destruction request. The way this is done is problematic, as seen in the following example:
contract mortal is owned {
    function kill() {
        if (msg.sender == owner) selfdestruct(owner);
    }
}

contract Base1 is mortal {
    function kill() { /* do cleanup 1 */ mortal.kill(); }
}

contract Base2 is mortal {
    function kill() { /* do cleanup 2 */ mortal.kill(); }
}

contract Final is Base1, Base2 {
}
A call to Final.kill() will call Base2.kill as the most derived override, but this function will bypass Base1.kill, basically because it does not even know about Base1. The way around this is to use super:
contract mortal is owned {
    function kill() {
        if (msg.sender == owner) selfdestruct(owner);
    }
}

contract Base1 is mortal {
    function kill() { /* do cleanup 1 */ super.kill(); }
}

contract Base2 is mortal {
    function kill() { /* do cleanup 2 */ super.kill(); }
}

contract Final is Base2, Base1 {
}
If Base1 calls a function of super, it does not simply call this function on one of its base contracts. Instead, it calls this function on the next base contract in the final inheritance graph, so it will call Base2.kill(). Note that the final inheritance sequence, starting with the most derived contract, is: Final, Base1, Base2, mortal, owned. The actual function that is called when using super is not known in the context of the class where it is used, although its type is known. This is similar to ordinary virtual method lookup.
Arguments for Base Constructors¶
Derived contracts need to provide all arguments needed for the base constructors. This can be done at two places:
contract Base {
    uint x;
    function Base(uint _x) { x = _x; }
}

contract Derived is Base(7) {
    function Derived(uint _y) Base(_y * _y) {
    }
}
Either directly in the inheritance list (is Base(7)) or in the way a modifier would be invoked as part of the header of the derived constructor (Base(_y * _y)). The first way to do it is more convenient if the constructor argument is a constant and defines the behaviour of the contract or describes it. The second way has to be used if the constructor arguments of the base depend on those of the derived contract. If, as in this silly example, both places are used, the modifier-style argument takes precedence.
Multiple Inheritance and Linearization¶
Languages that allow multiple inheritance have to deal with several problems, one of them being the Diamond Problem. Solidity follows the path of Python and uses “C3 Linearization” to force a specific order in the DAG of base classes. This results in the desirable property of monotonicity but disallows some inheritance graphs. Especially, the order in which the base classes are given in the is directive is important. In the following code, Solidity will give the error “Linearization of inheritance graph impossible”.
contract X {}
contract A is X {}
contract C is A, X {}
The reason for this is that C requests X to override A (by specifying A, X in this order), but A itself requests to override X, which is a contradiction that cannot be resolved.
A simple rule to remember is to specify the base classes in the order from “most base-like” to “most derived”.
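Applied to the failing example above, the rule suggests listing X before A. This is a minimal sketch of the reordered declaration, reusing the contract names from that example.

```solidity
contract X {}
contract A is X {}
// X is the "most base-like" contract, so it is listed first.
// C no longer requests that X override A, so the
// contradiction described above does not arise.
contract C is X, A {}
```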
Abstract Contracts¶
Contract functions can lack an implementation as in the following example (note that the function declaration header is terminated by ;):
contract feline { function utterance() returns (bytes32); }
Such contracts cannot be compiled (even if they contain implemented functions alongside non-implemented functions), but they can be used as base contracts:
contract Cat is feline { function utterance() returns (bytes32) { return "miaow"; } }
If a contract inherits from an abstract contract and does not implement all non-implemented functions by overriding, it will itself be abstract.
Libraries¶
Libraries are similar to contracts, but their purpose is that they are deployed only once at a specific address and their code is reused using the DELEGATECALL (CALLCODE until Homestead) feature of the EVM. This means that if library functions are called, their code is executed in the context of the calling contract, i.e. this points to the calling contract and especially the storage from the calling contract can be accessed. As a library is an isolated piece of source code, it can only access state variables of the calling contract if they are explicitly supplied (it would have no way to name them otherwise).
The following example illustrates how to use libraries (but be sure to check out the Using For section for a more advanced example that implements a set).
library Set {
    // We define a new struct datatype that will be used to
    // hold its data in the calling contract.
    struct Data { mapping(uint => bool) flags; }

    // Note that the first parameter is of type "storage
    // reference" and thus only its storage address and not
    // its contents is passed as part of the call. This is a
    // special feature of library functions. It is idiomatic
    // to call the first parameter 'self', if the function can
    // be seen as a method of that object.
    function insert(Data storage self, uint value)
        returns (bool)
    {
        if (self.flags[value])
            return false; // already there
        self.flags[value] = true;
        return true;
    }

    function remove(Data storage self, uint value)
        returns (bool)
    {
        if (!self.flags[value])
            return false; // not there
        self.flags[value] = false;
        return true;
    }

    function contains(Data storage self, uint value)
        returns (bool)
    {
        return self.flags[value];
    }
}

contract C {
    Set.Data knownValues;

    function register(uint value) {
        // The library functions can be called without a
        // specific instance of the library, since the
        // "instance" will be the current contract.
        if (!Set.insert(knownValues, value))
            throw;
    }
    // In this contract, we can also directly access knownValues.flags, if we want.
}
Of course, you do not have to use libraries in this way: they can also be used without defining struct data types. Library functions also work without any storage reference parameters, and they can have multiple storage reference parameters in any position.
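As a minimal sketch of a library that uses neither a struct nor a storage reference parameter, consider the following. The library MathUtil and the contract UsesMath are hypothetical names chosen for illustration.

```solidity
library MathUtil {
    // Operates on value types only; no storage is involved.
    function max(uint a, uint b) returns (uint) {
        return a > b ? a : b;
    }
}

contract UsesMath {
    function larger(uint a, uint b) returns (uint) {
        // Called on the library directly, without an instance.
        return MathUtil.max(a, b);
    }
}
```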
The calls to Set.contains, Set.insert and Set.remove are all compiled as calls (DELEGATECALLs) to an external contract/library. If you use libraries, take care that an actual external function call is performed.
msg.sender, msg.value and this will retain their values in this call, though (prior to Homestead, msg.sender and msg.value changed).
As the compiler cannot know where the library will be deployed, these addresses have to be filled into the final bytecode by a linker (see Using the Commandline Compiler on how to use the commandline compiler for linking). If the addresses are not given as arguments to the compiler, the compiled hex code will contain placeholders of the form __Set______ (where Set is the name of the library). The address can be filled in manually by replacing all those 40 symbols by the hex encoding of the address of the library contract.
Restrictions for libraries in comparison to contracts:
- no state variables
- cannot inherit nor be inherited
(these might be lifted at a later point)
Using For¶
The directive using A for B; can be used to attach library functions (from the library A) to any type (B). These functions will receive the object they are called on as their first parameter (like the self variable in Python).
The effect of using A for *; is that the functions from the library A are attached to any type.
In both situations, all functions, even those where the type of the first parameter does not match the type of the object, are attached. The type is checked at the point the function is called and function overload resolution is performed.
The using A for B; directive is active for the current scope, which is limited to a contract for now but will be lifted to the global scope later, so that by including a module, its data types including library functions are available without having to add further code.
Let us rewrite the set example from the Libraries section in this way:
// This is the same code as before, just without comments
library Set {
    struct Data { mapping(uint => bool) flags; }

    function insert(Data storage self, uint value)
        returns (bool)
    {
        if (self.flags[value])
            return false; // already there
        self.flags[value] = true;
        return true;
    }

    function remove(Data storage self, uint value)
        returns (bool)
    {
        if (!self.flags[value])
            return false; // not there
        self.flags[value] = false;
        return true;
    }

    function contains(Data storage self, uint value)
        returns (bool)
    {
        return self.flags[value];
    }
}

contract C {
    using Set for Set.Data; // this is the crucial change
    Set.Data knownValues;

    function register(uint value) {
        // Here, all variables of type Set.Data have
        // corresponding member functions.
        // The following function call is identical to
        // Set.insert(knownValues, value)
        if (!knownValues.insert(value))
            throw;
    }
}
It is also possible to extend elementary types in that way:
library Search {
    // Returns the index of the first occurrence of value,
    // or uint(-1) if value is not contained in the array.
    function indexOf(uint[] storage self, uint value) returns (uint) {
        for (uint i = 0; i < self.length; i++)
            if (self[i] == value) return i;
        return uint(-1);
    }
}

contract C {
    using Search for uint[];
    uint[] data;

    function append(uint value) {
        data.push(value);
    }

    function replace(uint _old, uint _new) {
        // This performs the library function call
        uint index = data.indexOf(_old);
        if (index == uint(-1))
            data.push(_new);
        else
            data[index] = _new;
    }
}
Note that all library calls are actual EVM function calls. This means that if you pass memory or value types, a copy will be performed, even of the self variable. The only situation where no copy will be performed is when storage reference variables are used.
Miscellaneous¶
Layout of State Variables in Storage¶
Statically-sized variables (everything except mapping and dynamically-sized array types) are laid out contiguously in storage starting from position 0. Multiple items that need less than 32 bytes are packed into a single storage slot if possible, according to the following rules:
- The first item in a storage slot is stored lower-order aligned.
- Elementary types use only that many bytes that are necessary to store them.
- If an elementary type does not fit the remaining part of a storage slot, it is moved to the next storage slot.
- Structs and array data always start a new slot and occupy whole slots (but items inside a struct or array are packed tightly according to these rules).
The elements of structs and arrays are stored after each other, just as if they were given explicitly.
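The packing rules above can be traced on a small contract. The contract Packing and its members are hypothetical, and the slot numbers in the comments are derived by hand from the rules as stated here, not from compiler output.

```solidity
contract Packing {
    uint128 a;  // slot 0, lower-order 16 bytes (first item in the slot)
    uint128 b;  // slot 0, higher-order 16 bytes (slot 0 is now full)
    uint8 c;    // slot 1, uses a single byte
    uint256 d;  // slot 2: 32 bytes do not fit into the rest of slot 1

    struct S {
        uint16 x;  // packed together with y inside the struct
        uint16 y;
    }

    S s;        // slot 3: a struct always starts a new slot
    uint8 e;    // slot 4: the struct occupies the whole of slot 3
}
```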
Due to their unpredictable size, mapping and dynamically-sized array types use a sha3 computation to find the starting position of the value or the array data. These starting positions are always full stack slots.
The mapping or the dynamic array itself occupies an (unfilled) slot in storage at some position p according to the above rule (or by recursively applying this rule for mappings to mappings or arrays of arrays). For a dynamic array, this slot stores the number of elements in the array (byte arrays and strings are an exception here, see below). For a mapping, the slot is unused (but it is needed so that two equal mappings after each other will use a different hash distribution). Array data is located at sha3(p) and the value corresponding to a mapping key k is located at sha3(k . p) where . is concatenation. If the value is again a non-elementary type, the positions are found by adding an offset of sha3(k . p).
bytes and string store their data in the same slot where also the length is stored if they are short. In particular: If the data is at most 31 bytes long, it is stored in the higher-order bytes (left aligned) and the lowest-order byte stores length * 2. If it is longer, the main slot stores length * 2 + 1 and the data is stored as usual in sha3(slot).
So for the following contract snippet:
contract c {
    struct S { uint a; uint b; }
    uint x;
    mapping(uint => mapping(uint => S)) data;
}
The position of data[4][9].b is at sha3(uint256(9) . sha3(uint256(4) . uint256(1))) + 1.
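The nesting in this formula can be spelled out step by step. The sketch below is written in Python and makes one loud assumption: the hash function is passed in as `hash_fn`, because sha3 here is Ethereum's Keccak-256, which Python's standard library does not provide (`hashlib.sha3_256` is the later standardised SHA-3 and yields different digests). To get real storage positions, a Keccak-256 implementation has to be supplied by the caller. All function names are hypothetical.

```python
def u256(n: int) -> bytes:
    """Encode n as a 32-byte big-endian word, like uint256(n)."""
    return n.to_bytes(32, "big")


def mapping_value_slot(key: int, p: int, hash_fn) -> int:
    """Slot of the value for `key` in a mapping at slot p: sha3(k . p)."""
    return int.from_bytes(hash_fn(u256(key) + u256(p)), "big")


def position_of_b(hash_fn) -> int:
    """Position of data[4][9].b for the contract snippet above."""
    data_slot = 1  # x occupies slot 0, so data occupies slot 1
    inner = mapping_value_slot(4, data_slot, hash_fn)      # data[4]
    struct_start = mapping_value_slot(9, inner, hash_fn)   # data[4][9]
    return struct_start + 1                                # b follows a
```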
Esoteric Features¶
There are some types in Solidity’s type system that have no counterpart in the syntax. One example is the type of a function. Still, using var it is possible to have local variables of these types:
contract FunctionSelector {
    function select(bool useB, uint x) returns (uint z) {
        var f = a;
        if (useB) f = b;
        return f(x);
    }
    function a(uint x) returns (uint z) {
        return x * x;
    }
    function b(uint x) returns (uint z) {
        return 2 * x;
    }
}
Calling select(false, x) will compute x * x and select(true, x) will compute 2 * x.
Internals - the Optimizer¶
The Solidity optimizer operates on assembly, so it can be and also is used by other languages. It splits the sequence of instructions into basic blocks at JUMPs and JUMPDESTs. Inside these blocks, the instructions are analysed and every modification to the stack, to memory or storage is recorded as an expression which consists of an instruction and a list of arguments which are essentially pointers to other expressions. The main idea is now to find expressions that are always equal (on every input) and combine them into an expression class. The optimizer first tries to find each new expression in a list of already known expressions. If this does not work, the expression is simplified according to rules like constant + constant = sum_of_constants or X * 1 = X. Since this is done recursively, we can also apply the latter rule if the second factor is a more complex expression where we know that it will always evaluate to one. Modifications to storage and memory locations have to erase knowledge about storage and memory locations which are not known to be different: If we first write to location x and then to location y and both are input variables, the second could overwrite the first, so we actually do not know what is stored at x after we wrote to y. On the other hand, if a simplification of the expression x - y evaluates to a non-zero constant, we know that we can keep our knowledge about what is stored at x.
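The recursive simplification described above can be illustrated with a toy model in Python. It is not the optimizer's code: the tuple representation of expressions, the function name `simplify` and the extra rules for constant products and 1 * X are assumptions made for this sketch. Only constant + constant and X * 1 = X are taken from the text.

```python
WORD = 2**256  # EVM words wrap around at 2**256


def simplify(expr):
    """Recursively simplify a tiny expression tree.

    An expression is an int (constant), a str (unknown input) or a tuple
    (op, left, right) with op in {"add", "mul"}. Hypothetical representation.
    """
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    # Simplify the arguments first, so rules also apply to complex operands.
    left, right = simplify(left), simplify(right)
    both_const = isinstance(left, int) and isinstance(right, int)
    if op == "add":
        if both_const:
            return (left + right) % WORD  # constant + constant = sum_of_constants
    elif op == "mul":
        if both_const:
            return (left * right) % WORD  # assumption: constants are folded too
        if right == 1:
            return left                   # X * 1 = X
        if left == 1:
            return right                  # assumption: symmetric rule 1 * X = X
    return (op, left, right)
```

For example, `simplify(("mul", "x", ("add", 0, 1)))` first reduces the second factor to the constant 1 and then applies X * 1 = X, which mirrors the case where the second factor is a more complex expression that always evaluates to one.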
At the end of this process, we know which expressions have to be on the stack in the end and have a list of modifications to memory and storage. This information is stored together with the basic blocks and is used to link them. Furthermore, knowledge about the stack, storage and memory configuration is forwarded to the next block(s). If we know the targets of all JUMP and JUMPI instructions, we can build a complete control flow graph of the program. If there is even a single target we do not know (this can happen because, in principle, jump targets can be computed from inputs), we have to erase all knowledge about the input state of a block, as it can be the target of the unknown JUMP. If a JUMPI is found whose condition evaluates to a constant, it is transformed to an unconditional jump.
As the last step, the code in each block is completely re-generated. A dependency graph is created from the expressions on the stack at the end of the block and every operation that is not part of this graph is essentially dropped. Now code is generated that applies the modifications to memory and storage in the order they were made in the original code (dropping modifications which were found not to be needed) and finally, generates all values that are required to be on the stack in the correct place.
These steps are applied to each basic block and the newly generated code is used as replacement if it is smaller. If a basic block is split at a JUMPI and during the analysis, the condition evaluates to a constant, the JUMPI is replaced depending on the value of the constant, and thus code like
var x = 7;
data[7] = 9;
if (data[x] != x + 2)
    return 2;
else
    return 1;
is simplified to code which can also be compiled from
data[7] = 9;
return 1;
even though the instructions contained a jump in the beginning.
Using the Commandline Compiler¶
One of the build targets of the Solidity repository is solc, the Solidity commandline compiler. Using solc --help provides you with an explanation of all options. The compiler can produce various outputs, ranging from simple binaries and assembly over an abstract syntax tree (parse tree) to estimations of gas usage. If you only want to compile a single file, you run it as solc --bin sourceFile.sol and it will print the binary. Before you deploy your contract, activate the optimizer while compiling using solc --optimize --bin sourceFile.sol. If you want to get some of the more advanced output variants of solc, it is probably better to tell it to output everything to separate files using solc -o outputDirectory --bin --ast --asm sourceFile.sol.
The commandline compiler will automatically read imported files from the filesystem, but it is also possible to provide path redirects using prefix=path in the following way:
solc github.com/ethereum/dapp-bin/=/usr/local/lib/dapp-bin/ =/usr/local/lib/fallback file.sol
This essentially instructs the compiler to search for anything starting with github.com/ethereum/dapp-bin/ under /usr/local/lib/dapp-bin and if it does not find the file there, it will look at /usr/local/lib/fallback (the empty prefix always matches). solc will not read files from the filesystem that lie outside of the remapping targets and outside of the directories where explicitly specified source files reside, so things like import "/etc/passwd"; only work if you add =/ as a remapping.
If there are multiple matches due to remappings, the one with the longest common prefix is selected.
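The prefix selection can be sketched in Python as a pure string rewrite. This is a simplified model rather than solc's actual resolution logic: the helper name `candidates`, the example file name `library/math.sol` and the trailing slash on the fallback target are assumptions made so that the rewritten paths join cleanly, and the sketch never touches the filesystem.

```python
def candidates(path, remappings):
    """Return rewritten paths for `path`, longest matching prefix first.

    `remappings` is a list of (prefix, target) pairs, as in prefix=target.
    Hypothetical helper; it only rewrites strings.
    """
    matches = [(prefix, target) for prefix, target in remappings
               if path.startswith(prefix)]
    # The empty prefix matches every path and has length 0, so it sorts last.
    matches.sort(key=lambda m: len(m[0]), reverse=True)
    return [target + path[len(prefix):] for prefix, target in matches]


remappings = [
    ("github.com/ethereum/dapp-bin/", "/usr/local/lib/dapp-bin/"),
    ("", "/usr/local/lib/fallback/"),  # trailing slash added for this sketch
]
```

With these remappings, a path under github.com/ethereum/dapp-bin/ is first tried under /usr/local/lib/dapp-bin/ and only then under the fallback directory, while any other path has the fallback as its only candidate.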
If your contracts use libraries, you will notice that the bytecode contains substrings of the form __LibraryName______. You can use solc as a linker, meaning that it will insert the library addresses for you at those points:
Either add --libraries "Math:0x12345678901234567890 Heap:0xabcdef0123456" to your command to provide an address for each library or store the string in a file (one library per line) and run solc using --libraries fileName.
If solc is called with the option --link, all input files are interpreted as unlinked binaries (hex-encoded) in the library placeholder format described above and are linked in-place (if the input is read from stdin, it is written to stdout). All options except --libraries are ignored (including -o) in this case.
Tips and Tricks¶
- Use delete on arrays to delete all their elements.
- Use shorter types for struct elements and sort them such that short types are grouped together. This can lower the gas costs as multiple SSTORE operations might be combined into a single one (SSTORE costs 5000 or 20000 gas, so this is what you want to optimise). Use the gas price estimator (with optimiser enabled) to check!
- Make your state variables public - the compiler will create getters for you for free.
- If you end up checking conditions on input or state a lot at the beginning of your functions, try using Function Modifiers.
- If your contract has a function called send but you want to use the built-in send-function, use address(contractVariable).send(amount).
- If you do not want your contracts to receive ether when called via send, you can add a throwing fallback function: function() { throw; }.
- Initialise storage structs with a single assignment: x = MyStruct({a: 1, b: 2});
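The tip about grouping short struct members can be checked with a small slot counter. The Python sketch below is a rough model for elementary types only; the helper name `slots_used` is hypothetical, and actual gas costs should still be checked with the compiler's estimates.

```python
def slots_used(sizes):
    """Count the 32-byte slots needed for fields of the given byte sizes,
    packed in declaration order. Hypothetical helper, elementary types only."""
    slots, used = 0, 32  # start "full" so that the first field opens slot 0
    for size in sizes:
        if used + size > 32:
            slots, used = slots + 1, 0
        used += size
    return slots


interleaved = slots_used([1, 32, 1])  # uint8, uint256, uint8
grouped = slots_used([1, 1, 32])      # uint8, uint8, uint256
```

In this model the interleaved ordering needs three slots and the grouped ordering only two, so writing all members touches one slot less.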
Pitfalls¶
Unfortunately, there are some subtleties the compiler does not yet warn you about.
- In for (var i = 0; i < arrayName.length; i++) { ... }, the type of i will be uint8, because this is the smallest type that is required to hold the value 0. If the array has more than 255 elements, the loop will not terminate.
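The wrap-around behind this pitfall can be simulated in Python. The sketch models the loop counter as an unsigned integer of a given bit width; the helper name `loop_terminates` and its `bits` parameter are assumptions made for this illustration.

```python
def loop_terminates(array_length, bits=8):
    """Simulate `for (var i = 0; i < array_length; i++)` with an unsigned
    counter of the given bit width. Hypothetical helper for illustration."""
    limit = 2**bits
    i = 0
    while i < array_length:
        i = (i + 1) % limit  # unsigned increment wraps around
        if i == 0:
            return False     # counter is back at its start: the loop repeats forever
    return True
```

With the default of 8 bits, `loop_terminates(255)` is true and `loop_terminates(256)` is false. Passing `bits=256` models a counter that is declared explicitly as uint instead of var, which terminates for any array length below 2**256.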
Cheatsheet¶
Global Variables¶
- block.coinbase (address): current block miner’s address
- block.difficulty (uint): current block difficulty
- block.gaslimit (uint): current block gaslimit
- block.number (uint): current block number
- block.blockhash (function(uint) returns (bytes32)): hash of the given block - only works for the 256 most recent blocks
- block.timestamp (uint): current block timestamp
- msg.data (bytes): complete calldata
- msg.gas (uint): remaining gas
- msg.sender (address): sender of the message (current call)
- msg.value (uint): number of wei sent with the message
- now (uint): current block timestamp (alias for block.timestamp)
- tx.gasprice (uint): gas price of the transaction
- tx.origin (address): sender of the transaction (full call chain)
- sha3(...) returns (bytes32): compute the Ethereum-SHA3 hash of the (tightly packed) arguments
- sha256(...) returns (bytes32): compute the SHA256 hash of the (tightly packed) arguments
- ripemd160(...) returns (bytes20): compute the RIPEMD-160 hash of the (tightly packed) arguments
- ecrecover(bytes32, uint8, bytes32, bytes32) returns (address): recover the address associated with the public key from an elliptic curve signature
- addmod(uint x, uint y, uint k) returns (uint): compute (x + y) % k where the addition is performed with arbitrary precision and does not wrap around at 2**256.
- mulmod(uint x, uint y, uint k) returns (uint): compute (x * y) % k where the multiplication is performed with arbitrary precision and does not wrap around at 2**256.
- this (current contract’s type): the current contract, explicitly convertible to address
- super: the contract one level higher in the inheritance hierarchy
- selfdestruct(address): destroy the current contract, sending its funds to the given address
- <address>.balance: balance of the address in Wei
- <address>.send(uint256) returns (bool): send given amount of Wei to address, returns false on failure.
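The difference between addmod/mulmod and ordinary wrapping arithmetic can be shown with Python's arbitrary-precision integers. This is a model of the arithmetic only: the helper `wrapping_add_mod` is a hypothetical name for comparison, and k is assumed to be non-zero.

```python
WRAP = 2**256  # ordinary uint arithmetic wraps around at this value


def addmod(x, y, k):
    """(x + y) % k without wrapping; k is assumed to be non-zero here."""
    return (x + y) % k


def mulmod(x, y, k):
    """(x * y) % k without wrapping; k is assumed to be non-zero here."""
    return (x * y) % k


def wrapping_add_mod(x, y, k):
    """What (x + y) % k yields when the addition itself wraps at 2**256."""
    return ((x + y) % WRAP) % k


big = WRAP - 1  # largest uint value
```

Here `addmod(big, 1, 10)` gives 6, the last decimal digit of 2**256, whereas `wrapping_add_mod(big, 1, 10)` gives 0 because the sum has already wrapped around to zero.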
Function Visibility Specifiers¶
function myFunction() <visibility specifier> returns (bool) { return true; }
- public: visible externally and internally (creates accessor function for storage/state variables)
- private: only visible in the current contract
- external: only visible externally (only for functions) - i.e. can only be message-called (via this.fun)
- internal: only visible internally
Modifiers¶
- constant for state variables: Disallows assignment (except initialisation), does not occupy storage slot.
- constant for functions: Disallows modification of state - this is not enforced yet.
- anonymous for events: Does not store event signature as topic.
- indexed for event parameters: Stores the parameter as topic.