How to put data into the SMPP SM field
In SMPP (Short Message Peer-to-Peer), the Short Message (SM) payload is the actual text (or binary data) of the SMS message being transmitted. To ensure the correct interpretation of this content, the data_coding field in the PDU plays a critical role by indicating the encoding format of the message.
Common SMPP Encodings
The data_coding field is 1 byte and informs the SMSC how to interpret the message payload.
Hex | Decimal | Encoding | Description |
---|---|---|---|
0x00 | 0 | GSM 7-bit default | Standard SMS character set |
0x01 | 1 | IA5 / ASCII | 7-bit ASCII characters, one per octet |
0x03 | 3 | Latin-1 (ISO 8859-1) | Western European charset |
0x04 | 4 | Binary | Octet unspecified (raw 8-bit binary data) |
0x08 | 8 | UCS2 | Unicode (16-bit, big-endian) |
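The table can be turned into a small lookup when encoding text in code. The following is a minimal sketch, not a library API: the names CODECS and encode_short_message are hypothetical, and the codec choices are assumptions based on Python's built-in codecs. GSM 7-bit (0x00) is left out because Python ships no codec for it.

```python
# Hypothetical lookup: data_coding value -> Python codec name.
# 0x00 (GSM 7-bit) is omitted because Python has no built-in codec for it.
CODECS = {
    0x01: "ascii",      # IA5 / ASCII
    0x03: "latin-1",    # ISO 8859-1
    0x08: "utf-16-be",  # UCS2, big-endian, no byte-order mark
}


def encode_short_message(text: str, data_coding: int) -> bytes:
    """Return the short_message bytes for `text` under `data_coding`."""
    if data_coding not in CODECS:
        raise ValueError(f"no text codec for data_coding 0x{data_coding:02X}")
    # Raises UnicodeEncodeError if a character does not fit the charset.
    return text.encode(CODECS[data_coding])


print(encode_short_message("Hello", 0x01).hex(" "))
print(encode_short_message("caf\u00e9", 0x03).hex(" "))
```

Letting the codec raise on characters it cannot represent makes a wrong data_coding choice fail early instead of producing a corrupted message.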
Examples of Encoded Messages
1. GSM 7-bit (data_coding = 0x00)
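Standard SMS encoding. It is the most space-efficient option: up to 160 characters fit in a single message because each character takes 7 bits. Many SMSCs expect the septets unpacked (one per octet) in short_message and pack them only for the air interface, so check your provider's documentation before sending packed data.
    Text: "Hello"
    GSM 7-bit packed: C8 32 9B FD 06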
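The packed bytes above can be reproduced with a short bit-packing routine. This is a sketch under one assumption: for the letters in "Hello" the GSM default alphabet code equals the ASCII code, so ord() is used in place of a full GSM 03.38 lookup table. The function name pack_septets is illustrative.

```python
from typing import Iterable


def pack_septets(septets: Iterable[int]) -> bytes:
    """Pack 7-bit values into octets, filling each octet from its low bits."""
    acc = 0    # bit accumulator
    nbits = 0  # number of valid bits currently held in acc
    out = bytearray()
    for s in septets:
        acc |= (s & 0x7F) << nbits
        nbits += 7
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # final partial octet, zero-padded
    return bytes(out)


# Assumption: these letters have identical codes in ASCII and GSM 03.38.
septets = [ord(c) for c in "Hello"]
print(pack_septets(septets).hex(" ").upper())  # C8 32 9B FD 06
```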
2. UCS2 (data_coding = 0x08)
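Used for text that the GSM alphabet cannot represent (e.g., Arabic, Chinese, emojis). A single message holds 70 16-bit code units. Characters outside the Basic Multilingual Plane, such as most emojis, are in practice sent as UTF-16 surrogate pairs and therefore count as two.
    Text: "مرحبا"
    UCS2 hex: 0645 0631 062D 0628 0627
    Bytes (hex): 06 45 06 31 06 2D 06 28 06 27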
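In Python the same bytes can be produced with the built-in utf-16-be codec, which is identical to UCS2 for characters in the Basic Multilingual Plane and writes no byte-order mark. A minimal sketch follows. The helper name ucs2_encode is hypothetical, and the text is written with escapes only to avoid right-to-left rendering issues in the source.

```python
def ucs2_encode(text: str) -> bytes:
    """Encode text as big-endian 16-bit code units without a byte-order mark."""
    return text.encode("utf-16-be")


# The Arabic greeting from the example, written as Unicode escapes.
text = "\u0645\u0631\u062d\u0628\u0627"
payload = ucs2_encode(text)
print(payload.hex(" ").upper())  # 06 45 06 31 06 2D 06 28 06 27
print(len(payload))              # 10, the value to place in sm_length

# A character outside the BMP becomes a surrogate pair (two 16-bit units).
print(len(ucs2_encode("\U0001F600")))  # 4
```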
3. ASCII (data_coding = 0x01)
Basic Latin characters only. Each character occupies a full octet, so it is less space-efficient than packed GSM 7-bit.
    Text: "Hello"
    ASCII hex: 48 65 6C 6C 6F
SMPP PDU Example with UCS2 Encoding
Here is an SMPP submit_sm PDU carrying a Unicode message:
    00000037                // command_length (55 bytes)
    00000004                // command_id (submit_sm)
    00000000                // command_status
    00000001                // sequence_number
    74657374 00             // service_type: "test" + NUL terminator
    01                      // source_addr_ton: International
    01                      // source_addr_npi: ISDN
    31323334 00             // source_addr: "1234" (ASCII) + NUL terminator
    01                      // dest_addr_ton
    01                      // dest_addr_npi
    35363738 00             // destination_addr: "5678" + NUL terminator
    00                      // esm_class
    00                      // protocol_id
    00                      // priority_flag
    00                      // schedule_delivery_time (empty C-string)
    00                      // validity_period (empty C-string)
    00                      // registered_delivery
    00                      // replace_if_present_flag
    08                      // data_coding: UCS2
    00                      // sm_default_msg_id
    0A                      // sm_length: 10 bytes
    06450631 062D0628 0627  // short_message: "مرحبا" in UCS2
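A PDU like this can be assembled programmatically so that command_length and sm_length are derived from the data rather than typed by hand. The sketch below uses only the struct module. The function names cstring and build_submit_sm are hypothetical, the TON/NPI values are fixed to 1 as in the dump, and optional TLV parameters are not handled.

```python
import struct

SUBMIT_SM = 0x00000004


def cstring(s: str) -> bytes:
    """Encode an SMPP C-octet string: ASCII bytes followed by a NUL."""
    return s.encode("ascii") + b"\x00"


def build_submit_sm(sequence_number: int, service_type: str, source_addr: str,
                    destination_addr: str, short_message: bytes,
                    data_coding: int) -> bytes:
    """Build a submit_sm PDU with default flags (illustrative sketch)."""
    if len(short_message) > 254:
        raise ValueError("short_message is limited to 254 octets")
    body = b"".join([
        cstring(service_type),
        bytes([0x01, 0x01]),        # source_addr_ton, source_addr_npi
        cstring(source_addr),
        bytes([0x01, 0x01]),        # dest_addr_ton, dest_addr_npi
        cstring(destination_addr),
        bytes([0x00, 0x00, 0x00]),  # esm_class, protocol_id, priority_flag
        cstring(""),                # schedule_delivery_time (empty)
        cstring(""),                # validity_period (empty)
        bytes([0x00, 0x00]),        # registered_delivery, replace_if_present_flag
        bytes([data_coding, 0x00, len(short_message)]),  # dcs, default id, sm_length
        short_message,
    ])
    # command_length covers the 16-byte header plus the body.
    header = struct.pack(">IIII", 16 + len(body), SUBMIT_SM, 0, sequence_number)
    return header + body


message = "\u0645\u0631\u062d\u0628\u0627".encode("utf-16-be")
pdu = build_submit_sm(1, "test", "1234", "5678", message, 0x08)
print(len(pdu), pdu.hex())
```

Computing the length from the serialized body avoids a mismatch between the header and the actual PDU size, which receivers typically treat as a framing error.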
Encoding and Concatenation
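Messages too long for a single SMS are split into parts, each prefixed with a User Data Header (UDH). The 6-byte concatenation header counts against the 140-byte payload, which reduces the capacity of each part: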
- GSM 7-bit: 160 → 153 chars per part
- UCS2: 70 → 67 chars per part
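These limits follow from simple arithmetic, sketched here under two assumptions: one SMS carries at most 140 octets of user data, and the concatenation UDH occupies 6 of them. The constant and function names are illustrative.

```python
MAX_USER_DATA = 140  # octets of user data in one SMS (assumed limit)
UDH_LEN = 6          # concatenation header with 8-bit reference, incl. length octet


def chars_per_part(bits_per_char: int, udh_len: int = 0) -> int:
    """Whole characters that fit in the octets left after the UDH."""
    return (MAX_USER_DATA - udh_len) * 8 // bits_per_char


print(chars_per_part(7), chars_per_part(7, UDH_LEN))    # 160 153
print(chars_per_part(16), chars_per_part(16, UDH_LEN))  # 70 67
```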
Example UDH for a message part:
    05 00 03 CC 02 01
    // 05: UDH length (5 bytes follow)
    // 00: IEI (concatenated short message, 8-bit reference)
    // 03: IE data length
    // CC: message reference
    // 02: total parts
    // 01: current part
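Building this header and prefixing it to each part can be sketched as follows for UCS2 text. The helper names concat_udh and split_ucs2 are hypothetical. The sketch assumes the 8-bit reference form of the header and a 140-octet limit per part, and it does not guard against cutting a surrogate pair at a part boundary. When sending such parts, the UDH indicator bit (0x40) in esm_class normally has to be set as well.

```python
from typing import List


def concat_udh(ref: int, total: int, seq: int) -> bytes:
    """6-byte concatenation UDH: length, IEI 0x00, IE length 3, ref, total, seq."""
    return bytes([0x05, 0x00, 0x03, ref & 0xFF, total, seq])


def split_ucs2(text: str, ref: int) -> List[bytes]:
    """Return short_message payloads; a UDH is added only when splitting is needed."""
    data = text.encode("utf-16-be")
    if len(data) <= 140:
        return [data]
    chunk = 134  # 140 octets minus the 6-byte UDH = 67 UCS2 characters
    # Note: a surrogate pair straddling a boundary would be cut in half here.
    pieces = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    return [concat_udh(ref, len(pieces), n) + p
            for n, p in enumerate(pieces, start=1)]


print(concat_udh(0xCC, 2, 1).hex(" ").upper())  # 05 00 03 CC 02 01
parts = split_ucs2("a" * 100, ref=0xCC)
print([len(p) for p in parts])
```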
Summary
SMPP provides flexible encoding options through the data_coding field. Proper encoding ensures compatibility across global networks, especially when handling multilingual text or binary data. Developers must match encoding types with the content and expected recipients to avoid message corruption.