Morphex's blogologue (Life, technology, music, politics, business, mental health and more)

Share on Facebook Share on Google+ Share on Twitter Share on LinkedIn

C & XML

These last couple of months I've been learning a bit of Assembler and C programming, as these days I have the time available. I've always found Python and other high-level languages fast enough for what I needed to do, but I've always wondered a bit about C and Assembler.

What I've learned so far is that the computer is in fact a very large calculator, and pretty much everything that happens is that instructions are called (for example adding two numbers), and that numbers are moved around in memory, disk, peripherals etc. I've found it useful to learn about Assembler and C because it gives me a more detailed and correct view of how things work in computing.

With my programming and system administration background, I found it easy to dive into C and Assembler, and I also appreciate a lot more what for example Python does as a high-level programming language.

I've been looking for some gig or project to create a C and Assembler project for, and what I've landed on so far is that I want to create an XML parser. An XML parser that validates the Unicode used, as well as insures that the document is "well formed". I haven't gotten that far yet, but I've pretty much decided that the parser should (for now at least) be restricted to an UTF-32-LE encoding, and that whenever I work with pointers the rule is to initialize to null when they are created as well as after free() has been called.

I think this is good fun and I do it whenever I have the time and energy, here's the code so far:

#include <stdlib.h>
#include <stdio.h>

int main() {
  char *buffer = NULL;
  int read = 0;
  buffer = malloc(1024*sizeof(char));
  FILE *file = NULL;
  file = fopen("test.xml.2", "rb+");
  read = fread(buffer, sizeof(char), 1024, file);
  if ((char)buffer[0] == (char)0xFF && (char)buffer[1] == (char)0xFE &&
      (char)buffer[2] == (char)0x00 && (char)buffer[3] == (char)0x00) {
    // We have a UTF-32-LE Byte Order Mark                                       
    printf("BOM found\n");
  } else {
    printf("BOM not found, %x\n", buffer[0]);
    exit(1);
  }
  printf("%i\n", read);
  fwrite(buffer, read, 1, stdout);
  printf("\n");
  free(buffer); buffer = NULL;
  return 0;
}

[Permalink] [By morphex] [Technology (Atom feed)] [09 May 11:44 Europe/Oslo]

Morphex's blogologue (Life, technology, music, politics, business, mental health and more)

Morphex's Blogodex

Older entries

C & XML