{
"extension": ".py",
"source": "import logging\nimport re\nfrom binascii import unhexlify\n\nfrom ctypes import c_ushort\nfrom decimal import Decimal\n\nfrom dlms_cosem.connection import XDlmsApduFactory\nfrom dlms_cosem.protocol.xdlms import GeneralGlobalCipher\n\nfrom dsmr_parser.objects import MBusObject, MBusObjectPeak, CosemObject, ProfileGenericObject, Telegram\nfrom dsmr_parser.exceptions import ParseError, InvalidChecksumError\nfrom dsmr_parser.value_types import timestamp\n\nlogger = logging.getLogger(__name__)\n\n\nclass TelegramParser(object):\n crc16_tab = []\n\n def __init__(self, telegram_specification, apply_checksum_validation=True):\n \"\"\"\n :param telegram_specification: determines how the telegram is parsed\n :param apply_checksum_validation: validate checksum if applicable for\n telegram DSMR version (v4 and up).\n :type telegram_specification: dict\n \"\"\"\n self.apply_checksum_validation = apply_checksum_validation\n self.telegram_specification = telegram_specification\n # Regexes are compiled once to improve performance\n self.telegram_specification_regexes = {\n object[\"obis_reference\"]: re.compile(object[\"obis_reference\"], re.DOTALL | re.MULTILINE)\n for object in self.telegram_specification['objects']\n }\n\n def parse(self, telegram_data, encryption_key=\"\", authentication_key=\"\", throw_ex=False): # noqa: C901\n \"\"\"\n Parse telegram from string to dict.\n The telegram str type makes python 2.x integration easier.\n\n :param str telegram_data: full telegram from start ('/') to checksum\n ('!ABCD') including line endings in between the telegram's lines\n :param str encryption_key: encryption key\n :param str authentication_key: authentication key\n :rtype: Telegram\n :raises ParseError:\n :raises InvalidChecksumError:\n \"\"\"\n\n if \"general_global_cipher\" in self.telegram_specification:\n if self.telegram_specification[\"general_global_cipher\"]:\n enc_key = unhexlify(encryption_key)\n auth_key = unhexlify(authentication_key)\n telegram_data = unhexlify(telegram_data)\n apdu = XDlmsApduFactory.apdu_from_bytes(apdu_bytes=telegram_data)\n if apdu.security_control.security_suite != 0:\n logger.warning(\"Untested security suite\")\n if apdu.security_control.authenticated and not apdu.security_control.encrypted:\n logger.warning(\"Untested authentication only\")\n if not apdu.security_control.authenticated and not apdu.security_control.encrypted:\n logger.warning(\"Untested not encrypted or authenticated\")\n if apdu.security_control.compressed:\n logger.warning(\"Untested compression\")\n if apdu.security_control.broadcast_key:\n logger.warning(\"Untested broadcast key\")\n telegram_data = apdu.to_plain_apdu(enc_key, auth_key).decode(\"ascii\")\n else:\n try:\n if unhexlify(telegram_data[0:2])[0] == GeneralGlobalCipher.TAG:\n raise RuntimeError(\"Looks like a general_global_cipher frame \"\n \"but telegram specification is not matching!\")\n except Exception:\n pass\n else:\n try:\n if unhexlify(telegram_data[0:2])[0] == GeneralGlobalCipher.TAG:\n raise RuntimeError(\n \"Looks like a general_global_cipher frame but telegram specification is not matching!\")\n except Exception:\n pass\n\n if self.apply_checksum_validation and self.telegram_specification['checksum_support']:\n self.validate_checksum(telegram_data)\n\n telegram = Telegram()\n\n for object in self.telegram_specification['objects']:\n pattern = self.telegram_specification_regexes[object[\"obis_reference\"]]\n matches = pattern.findall(telegram_data)\n\n # Some signatures are optional and may not be present,\n # so only parse lines that match\n for match in matches:\n try:\n dsmr_object = object[\"value_parser\"].parse(match)\n except ParseError:\n logger.error(\n \"ignore line with signature {}, because parsing failed.\".format(object[\"obis_reference\"]),\n exc_info=True\n )\n if throw_ex:\n raise\n except Exception as err:\n logger.error(\"Unexpected {}: {}\".format(type(err), err))\n raise\n else:\n telegram.add(\n obis_reference=object[\"obis_reference\"],\n dsmr_object=dsmr_object,\n obis_name=object[\"value_name\"]\n )\n\n return telegram\n\n @staticmethod\n def validate_checksum(telegram):\n \"\"\"\n :param str telegram:\n :raises ParseError:\n :raises InvalidChecksumError:\n \"\"\"\n\n # Extract the part for which the checksum applies.\n checksum_contents = re.search(r'\\/.+\\!', telegram, re.DOTALL)\n\n # Extract the hexadecimal checksum value itself.\n # The line ending '\\r\\n' for the checksum line can be ignored.\n checksum_hex = re.search(r'((?<=\\!)[0-9A-Z]{4})+', telegram)\n\n if not checksum_contents or not checksum_hex:\n raise ParseError(\n 'Failed to perform CRC validation because the telegram is '\n 'incomplete. The checksum and/or content values are missing.'\n )\n\n calculated_crc = TelegramParser.crc16(checksum_contents.group(0))\n expected_crc = int(checksum_hex.group(0), base=16)\n\n if calculated_crc != expected_crc:\n raise InvalidChecksumError(\n \"Invalid telegram. The CRC checksum '{}' does not match the \"\n \"expected '{}'\".format(\n calculated_crc,\n expected_crc\n )\n )\n\n @staticmethod\n def crc16(telegram):\n \"\"\"\n Calculate the CRC16 value for the given telegram\n\n :param str telegram:\n \"\"\"\n crcValue = 0x0000\n\n if len(TelegramParser.crc16_tab) == 0:\n for i in range(0, 256):\n crc = c_ushort(i).value\n for j in range(0, 8):\n if (crc & 0x0001):\n crc = c_ushort(crc >> 1).value ^ 0xA001\n else:\n crc = c_ushort(crc >> 1).value\n TelegramParser.crc16_tab.append(hex(crc))\n\n for c in telegram:\n d = ord(c)\n tmp = crcValue ^ d\n rotated = c_ushort(crcValue >> 8).value\n crcValue = rotated ^ int(TelegramParser.crc16_tab[(tmp & 0x00ff)], 0)\n\n return crcValue\n\n\nclass DSMRObjectParser(object):\n \"\"\"\n Parses an object (can also be see as a 'line') from a telegram.\n \"\"\"\n\n def __init__(self, *value_formats):\n self.value_formats = value_formats\n\n def _is_line_wellformed(self, line, values):\n # allows overriding by child class\n return (values and (len(values) == len(self.value_formats)))\n\n def _parse_values(self, values):\n # allows overriding by child class\n return [self.value_formats[i].parse(value)\n for i, value in enumerate(values)]\n\n def _parse_obis_id_code(self, line):\n \"\"\"\n Get the OBIS ID code\n\n Example line:\n '0-2:24.2.1(200426223001S)(00246.138*m3)'\n\n OBIS ID code = 0-2 returned as tuple\n \"\"\"\n try:\n return int(line[0]), int(line[2])\n except ValueError:\n raise ParseError(\"Invalid OBIS ID code for line '%s' in '%s'\", line, self)\n\n def _parse(self, line):\n # Match value groups, but exclude the parentheses\n pattern = re.compile(r'((?<=\\()[0-9a-zA-Z\\.\\*\\-\\:]{0,}(?=\\)))')\n\n values = re.findall(pattern, line)\n\n if not self._is_line_wellformed(line, values):\n raise ParseError(\"Invalid '%s' line for '%s'\", line, self)\n\n # Convert empty value groups to None for clarity.\n values = [None if value == '' else value for value in values]\n\n return self._parse_values(values)\n\n\nclass MBusParser(DSMRObjectParser):\n \"\"\"\n Gas meter value parser.\n\n These are lines with a timestamp and gas meter value.\n\n Line format:\n 'ID (TST) (Mv1*U1)'\n\n 1 2 3 4\n\n 1) OBIS Reduced ID-code\n 2) Time Stamp (TST) of capture time of measurement value\n 3) Measurement value 1 (most recent entry of buffer attribute without unit)\n 4) Unit of measurement values (Unit of capture objects attribute)\n \"\"\"\n\n def parse(self, line):\n return MBusObject(\n obis_id_code=self._parse_obis_id_code(line),\n values=self._parse(line)\n )\n\n\nclass MaxDemandParser(DSMRObjectParser):\n \"\"\"\n Max demand history parser.\n\n These are lines with multiple values. Each containing 2 timestamps and a value\n\n Line format:\n 'ID (Count) (ID) (ID) (TST) (TST) (Mv1*U1)'\n\n 1 2 3 4 5 6 7\n\n 1) OBIS Reduced ID-code\n 2) Amount of values in the response\n 3) ID of the source\n 4) ^^\n 5) Time Stamp (TST) of the month\n 6) Time Stamp (TST) when the max demand occured\n 6) Measurement value 1 (most recent entry of buffer attribute without unit)\n 7) Unit of measurement values (Unit of capture objects attribute)\n \"\"\"\n\n def parse(self, line):\n pattern = re.compile(r'((?<=\\()[0-9a-zA-Z\\.\\*\\-\\:]{0,}(?=\\)))')\n values = re.findall(pattern, line)\n\n obis_id_code = self._parse_obis_id_code(line)\n\n objects = []\n\n count = int(values[0])\n for i in range(1, count + 1):\n timestamp_month = ValueParser(timestamp).parse(values[i * 3 + 0])\n timestamp_occurred = ValueParser(timestamp).parse(values[i * 3 + 1])\n value = ValueParser(Decimal).parse(values[i * 3 + 2])\n objects.append(MBusObjectPeak(\n obis_id_code=obis_id_code,\n values=[timestamp_month, timestamp_occurred, value]\n ))\n\n return objects\n\n\nclass CosemParser(DSMRObjectParser):\n \"\"\"\n Cosem object parser.\n\n These are data objects with a single value that optionally have a unit of\n measurement.\n\n Line format:\n ID (Mv*U)\n\n 1 23 45\n\n 1) OBIS Reduced ID-code\n 2) Separator \"(\", ASCII 28h\n 3) COSEM object attribute value\n 4) Unit of measurement values (Unit of capture objects attribute) - only if\n applicable\n 5) Separator \")\", ASCII 29h\n \"\"\"\n\n def parse(self, line):\n return CosemObject(\n obis_id_code=self._parse_obis_id_code(line),\n values=self._parse(line)\n )\n\n\nclass ProfileGenericParser(DSMRObjectParser):\n \"\"\"\n Power failure log parser.\n\n These are data objects with multiple repeating groups of values.\n\n Line format:\n ID (z) (ID1) (TST) (Bv1*U1) (TST) (Bvz*Uz)\n\n 1 2 3 4 5 6 7 8 9\n\n 1) OBIS Reduced ID-code\n 2) Number of values z (max 10).\n 3) Identifications of buffer values (OBIS Reduced ID codes of capture objects attribute)\n 4) Time Stamp (TST) of power failure end time\n 5) Buffer value 1 (most recent entry of buffer attribute without unit)\n 6) Unit of buffer values (Unit of capture objects attribute)\n 7) Time Stamp (TST) of power failure end time\n 8) Buffer value 2 (oldest entry of buffer attribute without unit)\n 9) Unit of buffer values (Unit of capture objects attribute)\n \"\"\"\n\n def __init__(self, buffer_types, head_parsers, parsers_for_unidentified):\n self.value_formats = head_parsers.copy()\n self.buffer_types = buffer_types\n self.parsers_for_unidentified = parsers_for_unidentified\n\n def _is_line_wellformed(self, line, values):\n if values and (len(values) == 1) and (values[0] == ''):\n # special case: single empty parentheses (indicated by empty string)\n return True\n\n if values and (len(values) >= 2) and (values[0].isdigit()):\n buffer_length = int(values[0])\n return (buffer_length <= 10) and (len(values) == (buffer_length * 2 + 2))\n else:\n return False\n\n def _parse_values(self, values):\n if values and (len(values) == 1) and (values[0] is None):\n # special case: single empty parentheses; make sure empty ProfileGenericObject is created\n values = [0, None] # buffer_length=0, buffer_value_obis_ID=None\n buffer_length = int(values[0])\n buffer_value_obis_ID = values[1]\n if (buffer_length > 0):\n if buffer_value_obis_ID in self.buffer_types:\n bufferValueParsers = self.buffer_types[buffer_value_obis_ID]\n else:\n bufferValueParsers = self.parsers_for_unidentified\n # add the parsers for the encountered value type z times\n for _ in range(buffer_length):\n self.value_formats.extend(bufferValueParsers)\n\n return [self.value_formats[i].parse(value) for i, value in enumerate(values)]\n\n def parse(self, line):\n return ProfileGenericObject(\n obis_id_code=self._parse_obis_id_code(line),\n values=self._parse(line)\n )\n\n\nclass ValueParser(object):\n \"\"\"\n Parses a single value from DSMRObject's.\n\n Example with coerce_type being int:\n (002*A) becomes {'value': 2, 'unit': 'A'}\n\n Example with coerce_type being str:\n (42) becomes {'value': '42', 'unit': None}\n \"\"\"\n\n def __init__(self, coerce_type):\n self.coerce_type = coerce_type\n\n def parse(self, value):\n unit_of_measurement = None\n\n if value and '*' in value:\n value, unit_of_measurement = value.split('*')\n\n # A value group is not required to have a value, and then coercing does\n # not apply.\n value = self.coerce_type(value) if value is not None else value\n\n return {\n 'value': value,\n 'unit': unit_of_measurement\n }\n",
"review": "# 7\n\n## Bugs\n- The `crc16_tab` is mutable but populated only once, thus it is vulnerable to potential side-effects or race conditions if accessed in a multi-threaded context.\n- In the `ProfileGenericParser`, the logic assumes `values` length is correctly calculated; however, inadequate handling when regex fails to capture suitable groups is present.\n- The exception handling for hex operations could provide more specific cases rather than catching all Exceptions.\n\n## Optimizations\n- Preloading and reusing the compiled regex patterns across other parsing methods could improve performance and consistency.\n- The `crc16` method tabulation could be optimized or replaced with an alternative optimized library for CRC calculation.\n- Repeated warning checks for unhandled `security_suite` cases could be refactored into a helper method for better code reuse and maintainability.\n- Consider using a more modern Python class inheritance structure by extending directly from `object`.\n\n## Good Points\n- The use of logging to aid error-tracking and debugging.\n- Modular approach with multiple parsing classes, each with a specific responsibility.\n\n## Summary\nThe overall design of the code is modular, which is advantageous for maintainability. However, there are areas where optimizations, especially related to performance and security handling, could be implemented. Bugs are mostly related to error handling and potential issues when operating in a multi-threaded environment. Refactoring for more modern Python practices could enhance code readability and performance.\n\n## Open source alternatives\n- **PyCRC**: It offers common CRC algorithms like CRC16, which can be used instead of manually performing the CRC calculations.\n- **OpenDSS**: Data parsing and manipulation for smart grids and DSMR-related purposes.\n- **pyOBIS**: A library specifically for decoding OBIS (Object Identification System) codes.",
"filename": "parsers.py",
"path": "dsmr_parser/parsers.py",
"directory": "dsmr_parser",
"grade": 7,
"size": 14360,
"line_count": 405
}