Proposal for secure data format

otso · 28 September 2019 08:15

Hello all,

As Ruuvi Node and Ruuvi Dongle are moving forward, we’re re-evaluating the security needs of RuuviTag data.

Up to this point, accessing data from RuuviTags has required physical proximity due to the limited range of BLE transmissions, but once the data is sent to a default server maintained by Ruuvi, pretty much anyone from anywhere can access the data.

Security threats

Violating user privacy
– Movement data especially can be used to track the lifestyle of individuals if the owner of the RuuviTag is known. For example, a tag on a bed tells when the user went to sleep, how peacefully they slept and when they woke up.

Spoofed data
– If a user leaves a tag monitoring a remote location, someone could trigger a false alarm by sending data which shows that water pipes are freezing, etc.

Denial of service
– If someone can spam invalid data, it may lead to excess bandwidth or storage usage. If data is downsampled, the value remaining after downsampling could be one of the spam values.

Protection against threats

Encrypting tag data
– The tag data is encrypted with AES128 using a password and tag ID as the encryption key. For details about the encrypted data format please see this proposal. The attacker must have had physical access to scan the ID as well as know the user-configurable password.

Checksum on decrypted data, measurement counter
– The tag data contains a checksum of the decrypted data; the chance of a random string of bytes matching the checksum is 1/256. Additionally, the measurement counter value must be known by the spoofer, or the spoof is detected by jumps in the data.

Writing data to Ruuvi servers requires knowing MAC address
– The data is sent and queried by the MAC address of the tag, which is a 46-bit random number. It is possible to query the address space to find some tags and start spamming them, or if the target MAC address is known it can be spammed. Any ideas on how to protect against this sort of attack? One solution is to use a dedicated server with access control, but we’d like to offer free (as in beer and speech) alternatives.
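
A minimal sketch of the measurement counter check mentioned above, as it might look on the receiving side. The function name and the max_gap parameter are hypothetical; the proposal does not say how large a jump a receiver should tolerate, and a 16-bit counter is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receiver-side check: accept a packet only if its 16-bit
 * measurement counter moved forward by 1..max_gap since the last accepted
 * packet. Unsigned subtraction handles the wrap from 65535 back to 0. */
static bool counter_is_plausible(uint16_t last, uint16_t current, uint16_t max_gap)
{
    assert(max_gap > 0);
    uint16_t delta = (uint16_t)(current - last);
    return delta >= 1 && delta <= max_gap;
}
```

A replayed packet (same counter value) or a counter far from the expected value is rejected; max_gap would have to be tuned to how many advertisements a receiver can realistically miss.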

As always, we need to make compromises between security, convenience and pricing. Please let us know your ideas and feedback on this security model. At this point everything can be adjusted to meet the needs of the community.

ptrapnes · 2 October 2019 10:33

I took the liberty of linking to a competitor:
kontakt.io
I think they are on the right track… “but not free as in beer and the pricing could be better”

otso · 2 October 2019 10:54

Secure beaconing is already provided by Eddystone EID and eTLM which are a part of the Eddystone firmware for Ruuvi’s. Therefore the use case for such secure beaconing is already covered.

The kontakt.io model also seems to require contact to their servers:

Can I decipher the real beacon UUID, Major, Minor on the device, or do we need to contact the server to get the actual beacon identifiers?

You need to resolve the real values with our API or by using our SDK to get the value either on the fly or cached in advance locally.
[https://support.kontakt.io/hc/en-gb/articles/206770089-Frequently-Asked-Questions-about-Kontakt-io-Secure]

We want to keep the beacons useable in a non-connected environment.

Locking the firmware might be a reasonable additional precaution to protect the user-configured passwords, we’ll consider it.

ejk · 3 October 2019 18:14

IMO complete protection against DoS is just about impossible, but encryption and a checksum of the data provide some protection against DoS, among other security threats.
The MAC addresses are broadcast, so one would need some sort of algorithm that changes the MAC address at some interval, with the matching logic on the receiving end.
The crudest and most difficult type of DoS to protect against that I can think of is a sufficiently powerful radio signal, either by accident or out of malice, in the 2.4 GHz frequency range. Military etc. use frequency hopping to circumvent this vector, but this isn’t an option as long as Bluetooth is used.

dgerman · 9 November 2019 04:07

I present this discussion here (rather than on GitHub, see https://github.com/ruuvi/ruuvi-sensor-protocols/blob/master/dataformat_08.md, in particular issues #26 and #28) so as to solicit responses from a larger user community. I realize some of this information is redundant and more specific detail is located at GitHub.

The Bluetooth advertising packet restricts the payload size.

The current proposal includes:

Packet format number: 1 byte
Temperature in 0.005 degrees Centigrade 2 bytes
Humidity in 0.0025% 2 bytes
Atmospheric pressure in 1Pa 2 bytes
Power information, ie battery voltage in milli-volts and transmission power in 2dBm 2 bytes
Movement counter, incremented by the accelerometer 2 bytes
Measure Sequence number 2 bytes
reserved 2 bytes
CRC8 1 byte
MAC address 6 bytes are allocated because Apple APIs do not provide it

I offer the following observations:
A) Accelerometer values should be provided.
B) Experience with RawV2 format leads to the understanding that a 1 byte movement counter is too small. Even small movements increment the counter by 6 to 12. This causes the counter to wrap too quickly. Two bytes should be sufficient.
C) A measurement sequence number of 2 bytes, at a transmission period of 6.240 seconds as used by RawV2_slow, wraps after only 4.7 days.
D) Partial byte fields are sometimes appropriate depending on the range of values being represented, as seen in the power field. Combining bits from different bytes to create a field is not difficult, nor is biasing a field to reduce the value range.
E) Some fields are unnecessarily (even unscientifically) precise given the accuracy of the currently used sensors. Both temperature and humidity report varying values from one sample to the next even under static conditions. This precludes interpreting small changes as a trend. Anticipating that alternate sensors might be used in the future, it is unreasonable to expect to provide more than tenths of a degree or tenths of a percent of humidity of precision.
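
The wrap time in observation C can be checked with a few lines of C. This is only a sketch; the function name is made up for illustration, and it assumes the counter increments exactly once per transmission period.

```c
#include <assert.h>
#include <stdint.h>

/* Days until a counter of `bits` width wraps when it increments once
 * every `period_s` seconds. */
static double days_until_wrap(unsigned bits, double period_s)
{
    assert(bits > 0 && bits < 64);
    double steps = (double)(UINT64_C(1) << bits);
    return steps * period_s / 86400.0;
}
```

Calling days_until_wrap(16, 6.24) gives roughly 4.7 days, matching the figure above; other widths and periods can be evaluated the same way.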

I offer the following changes to the proposal:
I) Provide accelerometer values which represent the orientation of the tag. With a range of 180 degrees, a field width of 5 bits for each of 3 axes would provide a precision of 5.6 degrees while requiring only 15 bits (i.e. less than the 2 bytes which are currently “reserved”).
As seen with battery and TX power, it is not that difficult to extract bits and combine them.
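
To illustrate item I, here is a rough sketch of quantizing three angles to 5 bits each and packing them into a 16-bit field. The mapping from angle to code and the bit order (x in the high bits) are assumptions made for this example, not part of the proposal.

```c
#include <assert.h>
#include <stdint.h>

#define AXIS_BITS   5u
#define AXIS_LEVELS (1u << AXIS_BITS)  /* 32 steps of 180/32 = 5.625 degrees */

/* Quantize an angle in [0, 180] degrees to a 5-bit code (hypothetical mapping). */
static uint8_t quantize_angle(double degrees)
{
    assert(degrees >= 0.0 && degrees <= 180.0);
    unsigned code = (unsigned)(degrees * AXIS_LEVELS / 180.0);
    return (uint8_t)(code >= AXIS_LEVELS ? AXIS_LEVELS - 1u : code);
}

/* Pack three 5-bit codes into the low 15 bits of a 16-bit field: x high, z low. */
static uint16_t pack_orientation(uint8_t x, uint8_t y, uint8_t z)
{
    return (uint16_t)(((x & 0x1Fu) << 10) | ((y & 0x1Fu) << 5) | (z & 0x1Fu));
}

/* Reverse of pack_orientation. */
static void unpack_orientation(uint16_t packed, uint8_t *x, uint8_t *y, uint8_t *z)
{
    *x = (uint8_t)((packed >> 10) & 0x1Fu);
    *y = (uint8_t)((packed >> 5) & 0x1Fu);
    *z = (uint8_t)(packed & 0x1Fu);
}
```

The top bit of the 16-bit field stays free, which is the one remaining “reserved” bit.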

Ia) The MAC address for Bluetooth devices requires that the 2 high-order bits always be set. These 2 bits could be used for data, with the receiver reconstructing the MAC address with a simple MAC[0] |= 0xC0. Using the 2 high-order bits of the MAC and the remaining bit of “reserved” would give 6 bits per axis, increasing the accelerometer precision to about 2.8 degrees.
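
A sketch of the bit borrowing described in Ia, assuming the first MAC byte always has its two top bits set as stated above. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sender side: replace the two always-set top bits of MAC[0] with payload data. */
static uint8_t mac0_with_data(uint8_t mac0, uint8_t two_bits)
{
    assert(two_bits <= 3u);
    return (uint8_t)((mac0 & 0x3Fu) | (uint8_t)(two_bits << 6));
}

/* Receiver side: read the borrowed bits out of the transmitted byte. */
static uint8_t mac0_extract_data(uint8_t wire)
{
    return (uint8_t)(wire >> 6);
}

/* Receiver side: OR-ing with 0xC0 restores the fixed top bits of the address. */
static uint8_t mac0_restore(uint8_t wire)
{
    return (uint8_t)(wire | 0xC0u);
}
```

The receiver has to extract the data bits before restoring the address byte, since the restore step overwrites them.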

II) Measurement sequence number: increasing it to 3 bytes may be more than is necessary, as with a RawV2_fast period of 1 second it would wrap in xxx days.

III) Humidity which is in the range of 0…100 would better be reported in tenths of a percent requiring a max of 0x3E8. This requires only 10 bits allowing the additional 4 bits to be used to extend measurement sequence number.

IV) Up to an additional 4 bits could be obtained by restricting the “currently supported” format codes to use only the second nibble (bits 4-7) or first nibble, using 8X for this secure format. This would allow for an additional 7 future formats for a total of 16 active formats. This requires the receiver to mask the first byte of all packets to determine the format, a very simple operation. Existing applications which support previous formats, without the masking operation, would not match this and future formats which they don’t support.

V) Battery voltage, although interesting, is really worthless for the ultra low power used by the Ruuvi tags. It has been shown to be unable to predict failure. Tags with values seemingly high enough when encountering a significant reduction in temperature stop functioning and tags with very low values continue to operate.

VI) Temperature precision should also be reduced to tenths of a degree centigrade (.18 degrees F).

Bits from items IV, V and VI could be better used to extend other fields or provide fields for future sensors.

I look forward to your responses.
Revised 11/11/19 18:54EST

otso · 12 November 2019 17:46

Thanks for the comprehensive feedback. Can you provide an example of the data format? Please keep in mind that the actual encrypted portion can be only 16 bytes long.

Regarding battery voltage: this has improved a lot since the voltage measurement was synchronized to radio activity, but it is true that tags which experience large temperature swings can be unpredictable.

dgerman · 12 November 2019 20:18

VIa) Temperature should be biased and stored in the packet as a positive number. Using a signed number is only space efficient if the absolute value of the lowest and highest values are similar. The range of temperature is -40…85 degrees Centigrade for a total range of 125. Providing a precision of .125 degrees C (.225F) results in maximum value of 125*8 which can be represented in 10 bits.
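
The biased encoding in VIa could look roughly like this. The function names are illustrative, and clamping out-of-range readings is an assumption added for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TEMP_MIN_C        (-40.0)
#define TEMP_MAX_C        (85.0)
#define TEMP_STEPS_PER_C  8.0      /* 0.125 C resolution */

/* Encode temperature as an unsigned offset from -40 C.
 * The result is 0..1000 and therefore fits in 10 bits. */
static uint16_t encode_temperature(double celsius)
{
    if (celsius < TEMP_MIN_C) celsius = TEMP_MIN_C;  /* clamp to sensor range */
    if (celsius > TEMP_MAX_C) celsius = TEMP_MAX_C;
    return (uint16_t)((celsius - TEMP_MIN_C) * TEMP_STEPS_PER_C + 0.5);
}

/* Decode the 10-bit code back to degrees Celsius. */
static double decode_temperature(uint16_t code)
{
    assert(code <= 1000u);
    return code / TEMP_STEPS_PER_C + TEMP_MIN_C;
}
```

With this mapping -40 C encodes as 0, 0 C as 320 and 85 C as 1000.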

VII) If a whitelist of MACs can be established as a requirement, the packet need only contain the two low-order bytes of the MAC address to support a deployment of more than 65,000 tags and still identify them uniquely. If an organization happened to get 2 tags with the same 2 low-order bytes of MAC address (.0015% chance), one could be exchanged. This does require knowing the MAC address in order to create the whitelist. This has been an issue in the past, especially for Apple devices.

VIII) Is a CRC necessary, since the decrypted packet must result in a (partial) whitelisted MAC address to be valid?

otso · 13 November 2019 05:45

There is always a tradeoff between simplicity and efficiency of unpacking data. I’d prefer to use whole bytes instead of bit-packed data for ease of community adoption. Having standard formats, such as two’s complement signed integers, allows developers to use existing functions rather than developing their own unpacking for each data field.
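
As an illustration of how little code a whole-byte two’s complement field needs, here is a sketch that reads a 16-bit temperature in 0.005 degree steps. The big-endian byte order is an assumption for this example; the thread does not state it.

```c
#include <assert.h>
#include <stdint.h>

/* Read a big-endian two's complement 16-bit value from a byte buffer.
 * Big-endian order is assumed here. */
static int16_t read_i16_be(const uint8_t *p)
{
    assert(p);
    uint16_t raw = (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
    /* Map 0x8000..0xFFFF to -32768..-1 without implementation-defined casts. */
    return (raw & 0x8000u) ? (int16_t)((int32_t)raw - 65536) : (int16_t)raw;
}

/* Temperature field in 0.005 C steps, as in the field list earlier in the thread. */
static double temperature_celsius(const uint8_t *p)
{
    return read_i16_be(p) * 0.005;
}
```

A bit-packed or biased field would need its own mask, shift and offset for every field, which is the extra work being weighed here.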

The MAC address is mandatory for iOS compatibility. CRC is useful especially as the bluetooth scanners occasionally produce corrupt data as seen in various posts about spikes in the received data.
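
For reference, a receiver-side CRC8 check can be sketched as below. The polynomial (0x07) and initial value (0x00) are assumptions, since the thread does not name them, and placing the CRC in the last byte is only for the sake of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-8 with polynomial 0x07, initial value 0x00, no reflection.
 * These parameters are assumed, not taken from the proposal. */
static uint8_t crc8(const uint8_t *data, size_t len)
{
    uint8_t crc = 0x00;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 0x80u) ? (uint8_t)((crc << 1) ^ 0x07u)
                                : (uint8_t)(crc << 1);
        }
    }
    return crc;
}

/* Returns nonzero if the last byte equals the CRC of the preceding bytes. */
static int packet_crc_ok(const uint8_t *packet, size_t len)
{
    assert(len >= 2);
    return crc8(packet, len - 1) == packet[len - 1];
}
```

A corrupted scan result then fails the check and can be dropped before it shows up as a spike in the data.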

theBASTI0N · 26 November 2019 01:50

Hey,

Couldn’t there be two new packet formats?

The FW would then have a counter similar to the measurement counter.

When counter % 2 == 0, packet A would be sent;

otherwise packet B would be sent. This could be placed in the main sensor task, much like the mode change is done.

for example:

if (counter % 2 == 0)
  {
    /* even count: send packet A */
    encodeToPACKETA(data_buffer, &data, acceleration_events, BLE_TX_POWER);
  }
else
  {
    /* odd count: send packet B */
    encodeToPACKETB(&data, BLE_TX_POWER);
  }

Would replace:

switch(tag_mode)
  {
    case RAWv2_FAST:
    case RAWv2_SLOW:
      encodeToRawFormat5(data_buffer, &data, acceleration_events, BLE_TX_POWER);
      break;
    
    case RAWv1:
    default:
      encodeToRawFormat3(data_buffer, &data);
      break;
  }

The frequency of each measurement would be slightly lowered, but often a 1-second interval isn’t required unless looking at vibration.

Scrin · 26 November 2019 05:37

I’m like 90% sure I had already posted this, but apparently not. Perhaps it was somewhere else, Slack maybe, but clearly not here so here it goes.

One other option I have been thinking about in my head is a “dynamic data format” which could hold arbitrary data; there would be a “header byte”, a bitmask, containing the information what the actual sent measurements are, and those measurements would then be in order in the “actual payload” (rather than fixed offsets like they are in the current data formats).

This hasn’t really been necessary with unencrypted advertisements as there’s plenty of space, so I kind of forgot about this idea for a while, but with an encrypted advertisement with a very limited amount of usable data it could be useful.

This would allow all existing data to be sent encrypted, and advanced firmwares could even be keeping track of the rate of change and adjust the priorities based on that. ie. if the temperature and humidity is changing rapidly but the acceleration barely changes (ie. the tag is in a sauna, not moving), the tag could be including temperature and humidity more frequently in the advertisement than the acceleration.

For example:

Considering the data available with format 5, the bitmask could perhaps signify:

0 Temperature
1 Humidity
2 Pressure
3 Acceleration (all axes, it's rarely useful to include only one)
4 Power info (battery + tx power)
5 Movement counter
6 Measurement sequence number
7 Something new? Perhaps "config data", ie. used IIR filtering, oversampling, etc

So a “header byte” (bitmask) of 10011000, reading bit 0 as the leftmost bit, would include temperature, acceleration and power info. In this case the payload would be structured as:

Offset  Data
0       Data format
1       Data flags (bitmask)
2-3     Temperature
4-9     Acceleration (x, y, z)
10-11   Power info

Of course even the encrypted payload could hold more data, but for sake of simplicity I included only 3 measurements. Or perhaps some advertisements could be intentionally kept short if that saves any noticeable amount of battery?
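
A rough sketch of how a receiver could locate fields in such a dynamic payload. The field sizes are assumptions (2 bytes for most values, 6 for acceleration, 1 for the movement counter and the “something new” slot), and bit 0 is taken to be the leftmost bit so that it agrees with the 10011000 example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIELD_COUNT 8u

/* Assumed field sizes in bytes, indexed by bit number from the list above. */
static const uint8_t field_size[FIELD_COUNT] = { 2, 2, 2, 6, 2, 1, 2, 1 };

/* Return the payload offset of `field`, or -1 if its flag is not set.
 * Offsets 0 and 1 hold the data format and the flags byte. */
static int field_offset(uint8_t flags, unsigned field)
{
    assert(field < FIELD_COUNT);
    int offset = 2;
    for (unsigned bit = 0; bit < FIELD_COUNT; bit++) {
        int present = (flags >> (7u - bit)) & 1u;  /* bit 0 = leftmost bit */
        if (bit == field) {
            return present ? offset : -1;
        }
        if (present) {
            offset += field_size[bit];
        }
    }
    return -1; /* not reached */
}

/* Total payload length implied by the flags byte. */
static size_t payload_length(uint8_t flags)
{
    size_t length = 2; /* format byte + flags byte */
    for (unsigned bit = 0; bit < FIELD_COUNT; bit++) {
        if ((flags >> (7u - bit)) & 1u) {
            length += field_size[bit];
        }
    }
    return length;
}
```

For the flags byte 0x98 (10011000) this places temperature at offset 2, acceleration at 4 and power info at 10, for a total length of 12 bytes, which agrees with the table above.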

A dynamic format like this would even allow (runtime?) “configurable firmwares” to work “efficiently”, ie. if someone is interested only in pressure, right now they would either need to create their own data format or send all values of some existing data format.

For simplicity, the “default/example firmware” for this data format could be using a fixed bitmask, with the currently proposed values, or perhaps altering between two different bitmasks to support all measurements like @theBASTI0N suggested. The more advanced “dynamic behavior” and/or “runtime configurability” could be left for future improvements and/or custom firmwares.

otso · 26 November 2019 10:10

It’s definitely possible, but sending two different packets consumes double energy for the same data rate.

3.26+ versions of Ruuvi Firmware already do this internally, first 4 bytes are u32 which contains the bitmask and for each bit set the data has one float which represents the value.

GATT communication likewise trades bandwidth for simplicity and sends each data field separately with its own header.

theBASTI0N · 27 November 2019 00:29

The application would only send one packet at a time, so packet A would be sent on the first second, then packet B the next second, and so on.

So the new packets could include:

Packet A

Packet format number: 1 byte example 06
Temperature in 0.005 degrees Centigrade 2 bytes
Humidity in 0.0025% 2 bytes
Atmospheric pressure in 1Pa 2 bytes
X 2 bytes
Y 2 bytes
Z 2 bytes
Measure Sequence number 2 bytes
CRC8 1 byte
MAC address 6 bytes are allocated because Apple APIs do not provide it

Packet B

Packet format number: 1 byte example 07
X 2 bytes
Y 2 bytes
Z 2 bytes
Power information, ie battery voltage in milli-volts and transmission power in 2dBm 2 bytes
Movement counter, incremented by the accelerometer 2 bytes
Measure Sequence number 2 bytes
CRC8 1 byte
MAC address 6 bytes are allocated because Apple APIs do not provide it

theBASTI0N · 29 November 2019 04:42

One thing I found when testing alternative packets is that keeping them the same length makes switching between formats easier, as advertising_sizes will remain the same.

PACKET B should be:

Packet B

Packet format number: 1 byte example 07
00 2 bytes left blank could be used for something
X 2 bytes
Y 2 bytes
Z 2 bytes
Power information, ie battery voltage in milli-volts and transmission power in 2dBm 2 bytes
Movement counter, incremented by the accelerometer 2 bytes
Measure Sequence number 2 bytes
CRC8 1 byte
MAC address 6 bytes are allocated because Apple APIs do not provide it

This would make Packet A 22 bytes and Packet B 22 bytes, removing the 00 that would otherwise be added onto the end of Packet B.
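
The equal-length requirement can be checked at compile time. Below is a sketch with hypothetical struct and member names that mirror the two field lists; every member is a byte or byte array, so compilers in practice add no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of Packet A, mirroring the field list above. */
typedef struct {
    uint8_t format;          /* example value 0x06 */
    uint8_t temperature[2];
    uint8_t humidity[2];
    uint8_t pressure[2];
    uint8_t accel_x[2];
    uint8_t accel_y[2];
    uint8_t accel_z[2];
    uint8_t sequence[2];
    uint8_t crc8;
    uint8_t mac[6];
} packet_a_t;

/* Hypothetical layout of the revised Packet B. */
typedef struct {
    uint8_t format;          /* example value 0x07 */
    uint8_t unused[2];       /* the two bytes left blank */
    uint8_t accel_x[2];
    uint8_t accel_y[2];
    uint8_t accel_z[2];
    uint8_t power[2];
    uint8_t movement[2];
    uint8_t sequence[2];
    uint8_t crc8;
    uint8_t mac[6];
} packet_b_t;

/* C11 static_assert: the build fails if the two layouts drift apart. */
static_assert(sizeof(packet_a_t) == 22, "packet A must be 22 bytes");
static_assert(sizeof(packet_b_t) == sizeof(packet_a_t),
              "packets A and B must have the same length");
```

With both layouts at 22 bytes, the CRC8 and MAC address also land at the same offsets (15 and 16) in either packet.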