
Strings

Introduction

Strings in Solidity are a dynamically sized reference type that represents a sequence of UTF-8 encoded characters. Strings can be used to store text, such as names, addresses, and other data.

In Solidity, a string has the same underlying layout as the bytes type, and many common string operations such as concatenation, slicing, and comparison are performed by converting the string to bytes. However, manipulating strings on-chain is expensive in gas, so it is often recommended to perform string manipulation off-chain where possible. Ideally, this happens in the client (for example, the browser), and only the result is sent to the contract.

Using third-party string libraries can help simplify string manipulations on-chain and reduce the amount of code that needs to be written, but it is important to ensure that these libraries are secure and do not introduce any vulnerabilities into the contract.

As the costs of interacting with smart contracts decrease with layer-2 solutions, it may become more feasible to perform non-critical string manipulations on-chain. However, it is still important to consider the cost-benefit trade-off of performing these operations on-chain versus off-chain, and to ensure that the contract remains secure and efficient.

In this post, we will be covering strings and string manipulation.

Declaring and Initializing Strings

String literals are constant values that can be used in Solidity code. They are enclosed in double or single quotes and can contain printable ASCII characters and escape sequences; other characters require a unicode literal such as unicode"héllo". Strings can be declared and initialized using the following syntax:

string memory str = 'hello world';
string memory str2 = "hello world";

Both declarations produce the same string; single and double quotes are interchangeable.

On-Chain String Manipulation

It is advised that you perform these operations off-chain and then update the state of the string. However, the following are examples of how to perform on-chain string manipulation.

String to Bytes

  • bytes(): type conversion that turns a string into bytes.
string memory str = 'hello world';
bytes memory myBytes = bytes(str); // to bytes array

Bytes to String

  • string(): type casting to turn bytes into string.
string memory str = 'hello world'; 
bytes memory myBytes = bytes(str); // to bytes array
string memory bytesToStr = string(myBytes); // back to string again

String Length

  • sampleBytes.length: gets the length of the bytes representation. This equals the number of characters only for single-byte (ASCII) text, because strings are UTF-8 encoded.
string memory str = 'hello world';
bytes memory myBytes = bytes(str);
uint256 length = myBytes.length;  // 11 - length is a uint256
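
Because Solidity strings are UTF-8 encoded, the byte length can differ from the number of visible characters. The following is a minimal sketch of that difference; the contract and function names are hypothetical and chosen only for illustration, and the unicode literal assumes Solidity 0.7.0 or later.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical example contract, not part of the original article.
contract StringLength {
    // Returns the number of bytes in the UTF-8 encoding of str.
    function byteLength(string memory str) public pure returns (uint256) {
        return bytes(str).length;
    }

    // Compares an ASCII string with one containing a two-byte character.
    function examples() public pure returns (uint256 ascii, uint256 accented) {
        ascii = byteLength("hello");           // 5 bytes, 5 characters
        accented = byteLength(unicode"héllo"); // 6 bytes, 5 characters ('é' takes 2 bytes)
    }
}
```

If a true character count is needed, it is usually simpler to compute it off-chain.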

Get Letter based on Index

In Solidity 0.8.x, index access for string is not possible. Instead we must convert the string into a bytes data type, access the value, and convert it back to a string data type.

  • sampleBytes[x]: access index x of the zero-indexed bytes array. The result is a single byte of type bytes1.
string memory str = 'hello world';
bytes memory myBytes = bytes(str);
bytes1 myByte = myBytes[0]; // 0x68 ('h') - get first letter, but in byte format

Display Letter as String

Here we convert the byte back to a string. Because string() cannot convert a single byte (bytes1) directly, the byte is first packed into a bytes array with abi.encodePacked().

  • sampleBytes[x]: get the byte at position x.
  • string(abi.encodePacked(sampleByte)): convert the single byte back to a string.
string memory str = 'hello world';
bytes memory myBytes = bytes(str);
bytes1 myByte = myBytes[0]; // get first letter, but in byte format
string memory oneLetter = string(abi.encodePacked(myByte)); // 'h' - transform byte to string

Note: the conversions cannot simply be chained as they might be in JavaScript, because a bytes1 value cannot be converted directly to a string.

string memory oneLetter = string(bytes(str)[x]); // TypeError: Explicit type conversion not allowed from "bytes1" to "string memory".

Concatenating Strings

  • string.concat(): concatenates any number of strings. It is called on the string type itself, not on a variable. Available in Solidity 0.8.12 and above.
  • bytes.concat(): concatenates any number of bytes values. Available in Solidity 0.8.4 and above.
  • Prior to string.concat(), strings were joined using abi.encodePacked():
// using built in string method (Solidity 0.8.12 and above)
string memory str = "hello world";
string memory strPlus = string.concat(str, '+++', '!!!', '$$$'); // "hello world+++!!!$$$"

// using built in bytes method
string memory str1 = "hello ";
string memory str2 = "world";
bytes memory myBytes1 = bytes(str1);
bytes memory myBytes2 = bytes(str2);
bytes memory concatBytes = bytes.concat(myBytes1, myBytes2);
string memory concatenatedString = string(concatBytes); // "hello world"

// old method prior to Solidity 0.8.12
string memory str = "hello world";
string memory concatStr = string(abi.encodePacked(str, "!!!")); // "hello world!!!"
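
To try the three approaches side by side, they can be wrapped in a small contract. This is a minimal sketch; the contract and function names are hypothetical, and the pragma assumes a compiler of at least 0.8.12 so that string.concat is available.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.12;

// Hypothetical example contract, not part of the original article.
contract StringConcat {
    // Built-in string.concat, called on the type (Solidity 0.8.12 and above).
    function withStringConcat(string memory a, string memory b)
        public
        pure
        returns (string memory)
    {
        return string.concat(a, b);
    }

    // Built-in bytes.concat, converting to bytes and back.
    function withBytesConcat(string memory a, string memory b)
        public
        pure
        returns (string memory)
    {
        return string(bytes.concat(bytes(a), bytes(b)));
    }

    // Older approach using abi.encodePacked.
    function withEncodePacked(string memory a, string memory b)
        public
        pure
        returns (string memory)
    {
        return string(abi.encodePacked(a, b));
    }
}
```

Calling any of the three functions with "hello " and "world" returns "hello world".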

Comparing Two Strings

Solidity as of 0.8.x does not have a native method to compare two strings.

To do so, you'd need to:

  • use abi.encodePacked() to transform them into bytes arrays
  • compute the hash of each with keccak256() and compare the two hashes.
string memory str1 = "hello";
string memory str2 = "world";
string memory str3 = "hello";

function compareStrings(string memory a, string memory b) internal pure returns (bool) {
    return keccak256(abi.encodePacked(a)) == keccak256(abi.encodePacked(b));
}

bool str1AndStr2 = compareStrings(str1, str2);  // false
bool str1AndStr3 = compareStrings(str1, str3);  // true
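
A common refinement is to compare lengths first, so that strings of different lengths are rejected without hashing. The following is a minimal sketch of that idea; the contract and function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical example contract, not part of the original article.
contract StringCompare {
    // Returns true when a and b contain exactly the same bytes.
    function equal(string memory a, string memory b) public pure returns (bool) {
        // Strings of different byte lengths can never be equal, so skip hashing.
        if (bytes(a).length != bytes(b).length) {
            return false;
        }
        // Same length: compare the keccak256 hashes of the raw bytes.
        return keccak256(bytes(a)) == keccak256(bytes(b));
    }
}
```

With this sketch, equal("hello", "world") returns false and equal("hello", "hello") returns true.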

Extracting a Substring from a String

Currently, Solidity 0.8.x has no string equivalent of slice(). The slicing familiar from JavaScript's slice() or Python's sampleStr[start:end] is only available, in the form sampleBytes[start:end], for arrays located in calldata, not in memory. The difference between data locations is explained in Data Locations Storage Memory CallData.

Looping method

To get around this, the following method could be used. The require statements validate the input. If a check fails, the transaction reverts and the contract's state is restored to what it was before the transaction.

// without using calldata or bytes slices like myBytes[0:5]
// helper function
function getBytesSubstring(bytes memory data, uint start, uint end) public pure returns (bytes memory) {
    require(end >= start, "Invalid substring range");
    require(end <= data.length, "End index out of range");
    bytes memory result = new bytes(end - start);
    for (uint i = start; i < end; i++) {
        result[i - start] = data[i];
    }
    return result;
}

// function to get string
function subString(string memory str, uint start, uint end) public pure returns (string memory) {
    bytes memory strBytes = bytes(str);
    bytes memory resultBytes = getBytesSubstring(strBytes, start, end);
    string memory resultString = string(resultBytes);
    return resultString;
}

string memory str = 'hello world';
string memory strSubString = subString(str, 0, 5); // 'hello'

Using array slicing

We can experiment using byte slices that leverage calldata data locations in the following way.

The following code fails to compile because a calldata parameter can only receive data that already lives in calldata. In practice, for calldata in getBytesSubstring() to work, the function needs:

  • to be in a contract
  • that is called externally
// called internally with a memory string, so calldata does not work and results in an error
function getBytesSubstring(bytes calldata data, uint start, uint end) public view returns (bytes memory) {
    require(end >= start, "Invalid substring range");
    require(end <= data.length, "End index out of range");
    return data[start : end];
}

function subString(string calldata str, uint start, uint end) public view returns (string memory) {
    bytes memory resultBytes = getBytesSubstring(bytes(str), start, end);
    return string(resultBytes);
}

string memory myString = "Hello world";
string memory mySubstring = subString(myString, 0, 5);

If run this way, the compiler reports an error similar to the following:

Compiler errors:

error[9553]: TypeError: Invalid type for argument in function call. Invalid implicit conversion from string memory to string calldata requested.

string memory mySubstring = subString(myString, 0, 5);

To get the code to work, the functions must live in a contract that is called externally.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// create StringManipulation Contract
contract StringManipulation {
    function getBytesSubstring(bytes calldata data, uint start, uint end) public view returns (bytes memory) {
        require(end >= start, "Invalid substring range");
        require(end <= data.length, "End index out of range");
        return data[start : end]; // use array slicing only available on calldata bytes arrays
    }

    function subString(string calldata str, uint start, uint end) public view returns (string memory) {
        bytes memory resultBytes = getBytesSubstring(bytes(str), start, end); // convert to bytes
        return string(resultBytes); // return result converted into a string
    }
}

// Access method to perform manipulations
StringManipulation stringManipulation = new StringManipulation();  // initialize contract

string memory myString = "Hello world";
string memory mySubstring = stringManipulation.subString(myString, 0, 5); // returns 'Hello'

This code defines a Solidity contract called StringManipulation that provides two functions for manipulating strings:

  1. getBytesSubstring: This function takes a bytes array data, a starting index start, and an ending index end, and returns a new bytes array that is a slice of the original data array starting at the start index and ending at the end index (excluding the end index itself).
  2. subString: This function takes a string str, a starting index start, and an ending index end, and returns a new string that is a substring of the original str starting at the start index and ending at the end index (excluding the end index itself).

Both of these functions are declared as view, meaning they cannot modify state. Since they do not read state either, they could also be declared pure.

The code also creates a new instance of the StringManipulation contract called stringManipulation using the new keyword, which deploys a new contract.

Finally, the code defines a string myString with the value "Hello world", and calls the subString function on stringManipulation to get the substring from index 0 to 5 (which is "Hello"), and assigns the result to the mySubstring variable.

To use this code, you would need to deploy the StringManipulation contract to a blockchain and then call the subString function on an instance of the contract, passing in the string and indices you want to extract a substring from.

Display Character at Position x

Similarly, we can apply the same principles to access a single index within a string.

contract StringManipulation {
    function getByteAtIndex(bytes calldata data, uint index) public view returns (bytes1) {
        require(index < data.length, "Index out of range");
        return data[index];
    }

    function getCharAtIndex(string calldata str, uint index) public view returns (string memory) {
        bytes1 charByte = getByteAtIndex(bytes(str), index);
        return string(abi.encodePacked(charByte));
    }
}

// interacting with contract
// create new StringManipulation contract
 StringManipulation public stringManipulation = new StringManipulation();

// get values
string memory myString = "Hello world";
string memory myChar = stringManipulation.getCharAtIndex(myString, 6); // returns w

This code defines a Solidity contract called StringManipulation that provides two functions for manipulating strings:

  1. getByteAtIndex: This function takes a bytes array data and an index index, and returns the byte value at the specified index as a bytes1 value.
  2. getCharAtIndex: This function takes a string str and an index index, and returns the character at the specified index as a string.

Both of these functions are declared as view, meaning they cannot modify state. Since they do not read state either, they could also be declared pure.

The code also declares stringManipulation as a public state variable, so the compiler generates a getter for it, and uses the new keyword to deploy a new instance of the StringManipulation contract.

Finally, the code defines a string myString with the value "Hello world", and calls the getCharAtIndex function on stringManipulation to get the character at index 6 (which is "w") and assigns the result to the myChar variable. The getCharAtIndex function works by first calling getByteAtIndex to get the byte value at the specified index, then transforming the byte value to a string using abi.encodePacked.

To use this code, you would need to deploy the StringManipulation contract to a blockchain and then call the getCharAtIndex function on an instance of the contract, passing in the string and index you want to extract a character from.

Replace in Position
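
One way to replace a character is to convert the string to bytes, overwrite the byte at the given index, and convert back. The following is a minimal sketch, assuming single-byte (ASCII) characters; the contract and function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical example contract, not part of the original article.
contract StringReplace {
    // Replaces the byte at `index` with `newChar` and returns the resulting string.
    // Assumption: the string contains only single-byte (ASCII) characters.
    function replaceAt(string memory str, uint256 index, bytes1 newChar)
        public
        pure
        returns (string memory)
    {
        bytes memory strBytes = bytes(str);
        require(index < strBytes.length, "Index out of range");
        strBytes[index] = newChar; // overwrite one byte in place
        return string(strBytes);
    }
}
```

For example, replaceAt("hello world", 0, "j") returns "jello world". Overwriting one byte of a multi-byte UTF-8 character would produce invalid text, so this approach is only safe for ASCII input.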

Splitting a String
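
One possible approach is to scan the bytes twice: once to count the delimiters, and once to copy each part into a string array. The following is a minimal sketch, assuming a single-byte delimiter; the contract and function names are hypothetical, and the nested loops make it costly for long strings.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical example contract, not part of the original article.
contract StringSplit {
    // Splits `str` on every occurrence of the single-byte `delimiter`.
    function split(string memory str, bytes1 delimiter)
        public
        pure
        returns (string[] memory)
    {
        bytes memory strBytes = bytes(str);

        // First pass: the number of parts is the number of delimiters plus one.
        uint256 count = 1;
        for (uint256 i = 0; i < strBytes.length; i++) {
            if (strBytes[i] == delimiter) {
                count++;
            }
        }

        // Second pass: copy each part into its own string.
        string[] memory parts = new string[](count);
        uint256 partIndex = 0;
        uint256 start = 0;
        for (uint256 i = 0; i <= strBytes.length; i++) {
            // The end of the input closes the last part.
            if (i == strBytes.length || strBytes[i] == delimiter) {
                bytes memory part = new bytes(i - start);
                for (uint256 j = start; j < i; j++) {
                    part[j - start] = strBytes[j];
                }
                parts[partIndex] = string(part);
                partIndex++;
                start = i + 1;
            }
        }
        return parts;
    }
}
```

For example, split("hello world", " ") returns an array containing "hello" and "world".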


Reverse String


bytes memory str = "HELLO WORLD";
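
One way to reverse a value like this is to copy its bytes into a new array in the opposite order. The following is a minimal sketch, assuming single-byte (ASCII) characters; the contract and function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical example contract, not part of the original article.
contract StringReverse {
    // Returns a new string with the bytes of `str` in reverse order.
    // Assumption: the string contains only single-byte (ASCII) characters.
    function reverse(string memory str) public pure returns (string memory) {
        bytes memory strBytes = bytes(str);
        uint256 len = strBytes.length;
        bytes memory reversed = new bytes(len);
        for (uint256 i = 0; i < len; i++) {
            reversed[i] = strBytes[len - 1 - i];
        }
        return string(reversed);
    }
}
```

For example, reverse("HELLO WORLD") returns "DLROW OLLEH". Multi-byte UTF-8 characters would be corrupted by a byte-wise reversal.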


Off-chain String Manipulation

As of March 2022, the best practice is to perform any string manipulation off-chain and then update the state of the string. These manipulations can be done with Ethers.js, a library for interacting with the Ethereum blockchain and its ecosystem.

To do this, let's look at a SimpleStorage contract that has:

  • data to take a string
  • a public method to return the string
  • a setter method to update the string.
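
A minimal sketch of such a contract might look like the following. The function names get and set are assumptions for illustration, since they are not specified above.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch of the SimpleStorage contract described above.
contract SimpleStorage {
    // The stored string.
    string private data;

    // Public method to return the string.
    function get() public view returns (string memory) {
        return data;
    }

    // Setter method to update the string with a value prepared off-chain.
    function set(string calldata newData) public {
        data = newData;
    }
}
```

With this layout, a client would read the string through get, transform it off-chain, and write the result back with a single call to set.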

Conclusion

In conclusion, strings are an important aspect of Solidity programming. Strings can be declared and initialized using the standard string syntax. Solidity provides several functions and operators that can be used with strings to manipulate and concatenate them. Understanding how to work with strings in Solidity is important for writing safe and efficient smart contracts.

Feedback

Have feedback? Found something that needs to be updated, corrected, or explained?

Join our discord and share what can be improved.

Any and all feedback is welcomed.