Restructuring the database

Since Sunday, I’m restructuring the database. The problem that appear Sunday is a position with a longer FEN than what the database could accept. About a year and half, and I analyzed the database to look at the longer FEN, it was 57 characters long. In a goal to reduce the size of the database, I reduce the length of the column to 60 characters long. Sunday, a position with a FEN of 61 characters long appear, causing a lot problems.

Right now I’m changing the length of the column, and but it takes much longer time than what I thought. Neca had prove me that even in the suicide variant it’s possible to reach the longest length for a FEN (71 characters).

The process will produce a bigger database, and I wonder if the current hard drive will be big enough. If not I will buy a new one. Currently it’s a 500 Gb.

I hope to solve the issue as soon as possible, don’t be afraid to ask for news by email or chat or twitter.

*I’m not counting the color in the FEN length, it’s keep in a separate column in the database.

5 réflexions sur “Restructuring the database

  1. Why can’t you use a VARCHAR(71) column? That would only make a one byte overhead on the actual length of the string.

    • One byte can seem really small, but a byte by rows (position) that mean about 1 Gb. And I used CHAR instead VARCHAR for performance issue.
      So passing from CHAR(60) to CHAR(71) will increase the database for about 11 Gb.

      • The only case where VARCHAR(71) takes more space that CHAR(71) is if the string has 71 characters. In that case, it will take one more byte. Otherwise, it will take less bytes.

        Assuming the average FEN is about 45 bytes (I really have no idea, but the point still stands in any case), you would save an average of 25 bytes per row.

        I decided to do a little research, and from I saw, there is little to no speed improvement by using CHAR over VARCHAR. There will be minimal improvements when the data in the column has a fixed length. Otherwise, the reduced size of the data makes the cache more efficient and the smaller index makes searching faster.

        I don’t know the exact configuration of your database, but I really think you should consider testing the difference in speed between the two types.

      • I agree that in term of size, the varchar will be better. Only one index is using the fen.

        I consider doing the testing between the difference in speed, because right now the row is a fixed length, but to do the test I gonna a lot of time:
        -Setting a testing server
        -transfering the database (about 450 Gb currently)
        -do a modify version of the table (will took at less 3 days)
        – Figuring real tests
        – Test both database.

        I think I won’t have the choice to do it, since the database won’t stop growing.

  2. Why do you store FEN strings at all? A position can be stored as a a collection of twelve 64-bit integers (12 bytes in modern computers). You need six integers to locate the white pawns, knights, kings, queens, rooks and bishops, another six integers for the black pieces. Side to play is a flag, as is the presence of an en passant square. From this representation a FEN can be created on the fly. Similarly a FEN can be used as input to create the 12 integers you need. You can have stored procedures converting back and forth. Tell me if this doesn’t make sense to you.

Laisser un commentaire

Entrez vos coordonnées ci-dessous ou cliquez sur une icône pour vous connecter:

Logo WordPress.com

Vous commentez à l'aide de votre compte WordPress.com. Déconnexion / Changer )

Image Twitter

Vous commentez à l'aide de votre compte Twitter. Déconnexion / Changer )

Photo Facebook

Vous commentez à l'aide de votre compte Facebook. Déconnexion / Changer )

Photo Google+

Vous commentez à l'aide de votre compte Google+. Déconnexion / Changer )

Connexion à %s