10 stupid tips for MySQL

Você pode ler este post em português também.

Hello!

The April’s post brings 10 stupid tips for day-by-day work with MySQL. Let’s see them:

  1. Use a .my.cnf file in your homedir with user/password data:
    [client]
    user=root
    password=my_p@ss-w0rd
  2. End your commands with G instead of ";": This will bring a “form” view instead the normal tabular view. This is necessary a lot of times;
  3. Materialized views: When the regular views (created with CREATE VIEW) doesn’t offer the expected performance, use triggers to feed tables that could work like the regular views. Being carefull, it helps a lot!
  4. Take care of the parameter ENGINE of your databases and tables: Choose the best considering the R/W relation of the tables;
  5. Adjust your MySQL prompt to your best fit at your .my.cnf in your homedir. The example below shows "mysql:db_name>". There are also the u (to show the username) and h (to show the hostname):
    prompt='mysql:d>'
  6. At your mysql prompt, CTRL+A brings you to the begin of the line. CTRL+E brings you to the end;
  7. To create a t2 table with the same structure of t1 table:
    CREATE TABLE t2 LIKE t1;
  8. To create a user and give its access in one line:
    GRANT ALL ON db_name.* TO 'username'@'host' IDENTIFIED BY 'password'; FLUSH PRIVILEGES;
  9. To generate a CSV file from the shell (better than OUTFILE ’cause this below shows the column names):
    mysql -AD dbname -e "Your SELECT here;" | tr "t" ";" > resultset.csv
  10. And the last, but not least (stupid). To show the queries in execution (the FULL parameter is optional):

    SHOW [FULL] PROCESSLIST;

Memorize them and your day-by-day with MySQL will be easier.

See you!

Benchmarking of attribution methods of PHP

Você pode ler este post em português também.

Hey!!

This post will show you a benchmarking among the attribution ways supported by PHP:

a) Simple attribution (ex.: $j = $j + 1);
b) Right association attribution (ex.: $j += 1);
c) Non-associative attribution (ex.: $j++);

Test material:

  • PC: Dell Optiplex 755;
  • Environment: Bash, without X. Ext4 filesystem;
  • uname -a:
    Linux optiplex755 2.6.34.8-68.fc13.x86_64 #1 SMP Thu Feb 17 15:03:58 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

  • php -v:
    PHP 5.3.5 (cli) (built: Jan 7 2011 18:46:32)
    Copyright (c) 1997-2010 The PHP Group
    Zend Engine v2.3.0, Copyright (c) 1998-2010 Zend Technologies
    with Xdebug v2.1.0, Copyright (c) 2002-2010, by Derick Rethans

All the files used and generated by the test can be downloaded here. The applied methodology follows:
1) Turn off the PC by one minute;
2) @ bash: for i in `seq 1 100`; do echo $i && php t1.php >> t1_output01.log; done;
3) Turn off the PC by one more minute;
4) @ bash: for i in `seq 1 100`; do echo $i && php t1.php >> t1_output02.log; done;
5) Turn off the PC again, by one more minute;
6) @ bash: for i in `seq 1 100`; do echo $i && php t1.php >> t1_output03.log; done;

The t1.php, t2.php and t3.php has each one three log files with its execution times. The 300 results of each script got ordered by by its execution time (ascendant way) and throwed out the 100 best and the 100 worst values, giving to us only the 100 best average execution times.

The result:

Graph - Comparing the execution times
Graph - Comparing the execution times

The graph shows that the use of “($j += 1)” or “($j++)” method can give you 0.02 second of advantage, in average.

We are going to modify our Coding Standards documents to preferring one of advantage methods that I have showed in this comparison, saving even more processing time.

Vacations!!

Você pode ler este post em português também.

To all who follow this blog,

February was a month that I devoted myself entirely to family and vacations, and that’s why I didn’t write posts at February.

And you should know what happens to the amount of work when we went 25 days away in vacations 🙂

ASAP, another article about PHP will be writen. And I want to re-organize my site and blog at all, so…

See you 😉

Beware with flags!!

Você pode ler este post em português também.

Hello!

The first article of this year talks about data modeling.

A situation that makes me angry for years is when you have attributes that work as flags inside tables.

Think twice (or maybe three times) before create them because usually, they are unnecessary. And why? We create them to mark a state of a record, which can be done, many times, querying another attributes, generating duplicity situations (so much problematic, per si). Let’s see examples:

mysql> SELECT * FROM tb_orders LIMIT 5;
+------------ ... +--------------+------------ ... +----------------+
| order_id    ... | sent_method  | sent_date   ... | sent_flag      |
+------------ ... +--------------+------------ ... +----------------+
| 986565      ... | 1            | 2010-11-18  ... | Y              |
| 659813      ... | NULL         | NULL        ... | N              |
| 986542      ... | 2            | 2010-11-13  ... | Y              |
| 321598      ... | 1            | 2010-01-14  ... | Y              |
| 112982      ... | NULL         | NULL        ... | N              |
+------------ ... +--------------+------------ ... +----------------+
5 rows in set (0.00 sec)

mysql>

 

The example above shows us a common situation: a flag to show something which another tuple already show. It’s redundant, and it can, easily, broke the data integrity. An error of code or a simple forget during an insert or update operation can result at the situation below:

mysql> SELECT * FROM tb_orders LIMIT 5;
+------------ ... +--------------+------------ ... +----------------+
| order_id    ... | sent_method  | sent_date   ... | sent_flag      |
+------------ ... +--------------+------------ ... +----------------+
| 986565      ... | 1            | 2010-11-18  ... | Y              |
| 659813      ... | NULL         | NULL        ... | Y              |
| 986542      ... | 2            | 2010-11-13  ... | Y              |
| 321598      ... | 1            | 2010-01-14  ... | Y              |
| 112982      ... | NULL         | NULL        ... | N              |
+------------ ... +--------------+------------ ... +----------------+
5 rows in set (0.00 sec)

mysql>

 

Or even:

mysql> SELECT * FROM tb_pedidos LIMIT 5;
+------------ ... +--------------+------------ ... +----------------+
| order_id    ... | sent_method  | sent_date   ... | sent_flag      |
+------------ ... +--------------+------------ ... +----------------+
| 986565      ... | 1            | 2010-11-18  ... | S              |
| 659813      ... | 1            | 2010-05-12  ... | N              |
| 986542      ... | 2            | 2010-11-13  ... | S              |
| 321598      ... | 1            | 2010-01-14  ... | S              |
| 112982      ... | NULL         | NULL        ... | N              |
+------------ ... +--------------+------------ ... +----------------+
5 rows in set (0.00 sec)

mysql>

 

How we can proceed to treat this kind of situation? The answer is to analyze each flag as follow:
* Verify each flag you want to create. There is another attribute (or a group of them, even in another tables) that be able to show the required situation? If so, maybe this flag is useless, and you can perform the following actions:
* Create this flag as a calculated fields in the analyzed table;
* Create views, and inside them create your flags as (again) calculated fields;
* And finally, use can use triggers to populate theses flags automatically, watching the attribute/tuple that can show the situation.

Sure, there are situations where the flags are essential. In this cases, I clap for them (see the “gender” and “record_active” attributes below):

mysql> SELECT * FROM tb_pessoas LIMIT 5;
+------------ ... +------- ... +------------+ ... +-----------------+ ...
| person_name ... | gender ... | reg_date   | ... | record_active   | ...
+------------ ... +------- ... +------------+ ... +-----------------+ ...
| Carlos      ... | M      ... | 2006-01-18 | ... | S               | ...
| Sandro      ... | M      ... | 2005-01-12 | ... | S               | ...
| Henrique    ... | M      ... | 2008-02-13 | ... | S               | ...
| Luiza       ... | F      ... | 2006-01-14 | ... | S               | ...
| Antonio     ... | M      ... | 2008-01-14 | ... | N               | ...
+------------ ... +------- ... +------------+ ... +-----------------+ ...
5 rows in set (0.00 sec)

mysql>

 

A good tip would be to ask the following question to yourself before to create an attribute of indicator/flag type: “This attribute will works to show a situation whose state can change during the life of the record?” If the answer tends to YES then imagine the semaphore should be, at least, in yellow. Then proceed with caution and consider whether you can instead of create more one physical attribute, to create a calculated field, a view or a trigger to populate it automatically.

Do you agree/disagree or have something to add? Comment!!

See you! 😉

How to be a good IT professional

Você pode ver este post em português também.

I’ll continue below this post of Elton Luís Minetto, complementing your ideas about everything I learned on IT environments. Here we go:

  1. Graduate yourself. I know people that leave the graduation behind and are going well, and I can garantee that this kind of people is rare. Different from @eminetto, I joined to the graduation at 2004 with some experience on IT (I started at 1997). And even tought the graduation expanded (better, it did try and applied) knowledge on fields which I had never heard before, and it was really valuable;
  2. Know the business target. Spend some hours of your week following the work of the people that orbitate your systems. Yes, stop to hunt bits and bytes and try to understand the ecosystem around them;
  3. Do the good, the best can be done later. Stop to spend time (and resources) to deliver that piece of software (or a change, or that model, or even that architecture, etcetera.) in a great degree of details. Reach the required goal. If you did it, deliver it. You can organize yourself to, at intervals of time, review your work and fits it to the current necessity (ok, I admit I’m still looking then this item 100% of the time…);
  4. Fall in love with your work. Nowadays I believe that is the only way to make us to think if we can do something that we already do, better. Look for more details about the programming language you are using. Learn more about the database systems. Learn about standards (seriously, there are standards for many things in software and use them helps a lot!). Learn more about the business to which your software meets;
  5. Serve the man, not to machines. This small excerpt of the vow of the engineer says a lot to me. I’m a GNU/Linux user since 2001, but I offered a lot of support for MS-Windows, I taught classes about MS-Office, I managed several MS-Windows servers, I used MS-Windows a lot. I’ve programmed with Clipper, C, Delphi, C#, ASP.NET, ASP, and nowadays I’m a PHP programmer. I’ve worked with Progress, MS-SQL Server, and nowadays I’m currently working with Oracle and MySQL. I’ve integrated systems with EDI (come on!!), today the fashion is Web services and REST. The way (technology) is important, but don’t forget the purpose, objectives and requirements that must be met. For the customer, is what matters;
  6. Documentation is, definitely, important. When your boss (or yourself) needs to make a presentation or discuss an integration, new features, boundaries, processes, etc.. with people who have no knowledge about your code, you will agree with me. Unfortunately, nowadays, the pain is the only way that I see to make someone understand this…

OK, that’s it. I suggest, so insistent, that you read the post from Elton. There are ideas there that I just tried to keep there, avoiding repetitions.

See you! 😉

Load Balancing with DNS

Você pode ler este post em português também

Hello!

The article of this month will show how is easy to balance the load of an application using your DNS.

However, you must be able to do the following actions:

  1. Take from your web server the responsibility of storing the session data of your application;
  2. To have access to the DNS server that holds the records of your domain;
  3. To have more than one web server to holds the requests (obvious, but it’s good to say…).

If your application is written in PHP, you can treat the first item above trough the session_set_save_handler. It’s beyond the scope of this post to show how to do this, but a visit in PHP website will detail the use.

The big trick is really on the configuration at DNS server (2nd item). You must configure, to the same A record of your application, the both (or more) IP address which can handle the requests. The piece of code bellow has been taken from a zone configuration file of a DNS server:


ns1 IN A 192.168.0.1
www IN A 192.168.0.2
mta IN A 192.168.0.3
app IN A 192.168.0.4
app IN A 192.168.0.5

This feature of balancing the hits of a host to different IP addresses at DNS server could be done via SRV records.

The screenshot below shows us (by Apache logs) that both servers are treating the hits.

Application Load Balancing with DNS
Application Load Balancing with DNS

The scenario will complicate a little bit more if your application needs to receive data via upload, which will implies in a common place to save the files.

See you 😉

TRAC on Fedora with 5 steps

Você pode ler este post em português também.

Hello!

This month I’ll show you how to perform a clean installation of the TRAC on Fedora.

TRAC is the best open source SCM tool that I’ve listened. The Fedora distribution is a widely used GNU/Linux.

At the end of this article I’ll also show how to use it with the GIT, making it a version control system too.

 

Let’s do it!

  1. Packages installation:
    • su -
    • yum update
    • yum install wget python python-devel python-setuptools python-genshi python-docutils python-pygments policycoreutils-python
    • yum install httpd httpd-tools mod_python
    • yum install python-offtrac trac trac-accountmanager-plugin trac-customfieldadmin-plugin trac-doxygen-plugin trac-iniadmin-plugin trac-privateticketsplugin trac-ticketdelete-plugin trac-tracnav-plugin trac-xmlrpc-plugin
  2. Creation of the TRAC environment on filesystem. We’ll call the base directory of your project of BASE_DIR (that’d be, for instance, /var/www/html/nome_projeto/scm). The commands semanage and restorecon give to the TRAC files the same context of Apache to the SeLinux. Obviously, you can ignore them if your SeLinux is disabled:
    • trac-admin BASE_DIR initenv
    • chown -R apache.apache BASE_DIR
    • semanage fcontext -at httpd_sys_content_t "BASE_DIR(/.*)?"
    • semanage fcontext -at httpd_sys_content_rw_t "BASE_DIR/attachments(/.*)?"
    • semanage fcontext -at httpd_sys_content_rw_t "BASE_DIR/conf(/.*)?"
    • semanage fcontext -at httpd_sys_content_rw_t "BASE_DIR/db(/.*)?"
    • semanage fcontext -at httpd_sys_content_rw_t "BASE_DIR/log(/.*)?"
    • chown -R apache:apache BASE_DIR
    • restorecon -R BASE_DIR
  3. Creation of the user which will does the administration of the TRAC environment for the project. We’ll call this user ADMIN_USER:
    • htpasswd -c BASE_DIR/conf/trac.htpasswd ADMIN_USER
    • trac-admin BASE_DIR
    • permission add ADMIN_USER TRAC_ADMIN
  4. To modify the Apache configuration. First, we’ll add an alias and a directory tag pointing to the base directory of TRAC (ScriptAlias e Directory). After, we’ll configure the specific directives for the environment (Location). Just to remind you, these directives could be, if necessary, inside a VirtualHost container:
    • ScriptAlias /trac/ "BASE_DIR/"
      <Directory "BASE_DIR">
      AllowOverride None
      Options None
      Order allow,deny
      Allow from all
      </Directory>
      
      <Location /trac>
      SetHandler mod_python
      PythonHandler trac.web.modpython_frontend
      PythonOption TracEnv BASE_DIR
      PythonOption TracUriRoot /trac
      SetEnv PYTHON_EGG_CACHE /tmp
      PythonInterpreter trac
      </Location>
      
      <Location /trac/login>
      AuthType Basic
      AuthName "Name of the project"
      AuthUserFile BASE_DIR/conf/trac.htpasswd
      Require valid-user
      </Location>
  5. The last step is to restart the Apache:
    • service httpd restart

Everything is done! Point your browser to access the TRAC environment (maybe the famous http://localhost/trac/) e have fun!

 

Git

If you want use GIT as a plugin of TRAC, do the steps below, additionaly:

  • Install GIT and the git plugin of the TRAC:
    • yum install git GitPython
    • yum install trac-git-plugin
  • Modify the file BASE_DIR/conf/trac.ini as follow:
    • The line repository_dir should point to same directory of the HEAD file of your repository (normally REPO_DIR/.git);
    • Add the lines below to the end of file:
      • [components]
      • tracext.git.* = enabled
  • The source directory must have its read and write permissions assigned to Apache.

The five steps above will give you a TRAC environment ready for use. To improve it and adapt it to your needs, visit the website TracHacks where there are many plugins for download.

See you 😉

Você pode ler este post em português também.

Prioritization of demands by voting

Você pode ler este post em português também.

Our development team did do an interesting experiment to elect the most critical demands, those ones that needed to be built first. To reach this, we created a spreadsheet that, although our creation, brought ideas from other worksheets that came across to us here in the company.

We took advantage of a forum that happened among 30 people, all heavy-users of our software. In it, our main tool was a spreadsheet (link at end of article), whose tabs are described below:

  • DEMANDS: A simple list of demands with the improvements requested by users. It’s the result of months of informal conversations, emails, phone calls etc. It has an ID column to make the lookup easier, a title and a brief description. This spreadsheet was printed only once, and stayed with me throughout the work;
  • FORM: Almost the same of the DEMANDS worksheet, but with an empty column where each participant will fill out your vote. It was printed in sufficient number to have one for each participant;
  • FEEDBACKS: The lines, again, are the demands. Each column represents a person present in the forum. The intersection is the vote of each one to the demand. We ask the participants a rating from 1 to 10, corresponding to the size of the benefit this demand will brings to their area, if it be built. The last column of this spreadsheet is a simple average of the votes between 1 and 10. Thus, if the demand is beyond the scope of work of the participant, he/she can vote 0 (zero) and your vote will be ignored in the average;
  • CALC: Here’s the trick. Days before this forum, we did the same job with the developers. In this worksheet is only computed the averages given by them to every demand. The developers rated from 1 to 10 corresponding to the size of the effort that the demand would have to be built. Then we applied a tiny part of AI (fuzz / defuzz) to classify the effort between “Little” and “Much”. The other two columns do the same with the attribute voted by the participants. The first column only replicates the values from FEEDBACKS worksheet, and the second one classify the value between “Little” and “Much”;
  • GRAPH: Its a quadrant that arranges the demands inside groups according to their benefit and effort, as below:

Chart example

To fill out the worksheet FEEDBACKS, we paused the presentation by one hour, giving time for another presentation to happen.

The result of this work is the best. It’s the match of your users most want / need, with that the developers say be easier to build first. Thus, in the quadrants of the graph, you will notice which clearly demands bring much benefit with little effort, for instance. This adds value to development, customer relationship, and here, has brought more excitement and satisfaction to the developers.

Now I look forward opinions about where in the process this worksheet will bring more help. Here we used in the maintenance phase. I believe which in new projects, the greatest gain would come in the phase of transition of the product, where there is a certain level of knowledge between the supplier and customer.

The workbook is published in XLS and will work normally in OpenOffice Calc. This is hosted on GitHub, and you can get it clicking here.

See you 😉

MySQL: Copy only the structure of table

Você pode ler este post em português também.

Hello!

Last month I had to copy the structure of a table to a new one. More than two people were surprised with the way and ease which I did it, so I decided to post it here instead of another performance comparison inside PHP.

Below is an example. Enjoy and feel free to share ideas about it.

[arglbr@t64 ~]$ mysql -AD test
Welcome to the MySQL monitor. Commands end with ; or g.
Your MySQL connection id is 20
Server version: 5.1.47 Source distribution

Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.

mysql> CREATE TABLE `test`.`t1` (
-> `id` SMALLINT UNSIGNED NOT NULL AUTO_INCREMENT,
-> `name` VARCHAR(50) NOT NULL,
-> PRIMARY KEY (`id`),
-> INDEX `idx_01`(`name`)
-> ) ENGINE = InnoDB;
Query OK, 0 rows affected (0.10 sec)

mysql> insert into t1 values (1, 'Adriano'), (2, 'Andréa'), (3, 'Arlete');
Query OK, 3 rows affected (0.04 sec)
Records: 3 Duplicates: 0 Warnings: 0

mysql> select * from t1;
+----+---------+
| id | name |
+----+---------+
| 1 | Adriano |
| 2 | Andréa |
| 3 | Arlete |
+----+---------+
3 rows in set (0.00 sec)

mysql> create table t2 like t1;
Query OK, 0 rows affected (0.10 sec)

mysql> select * from t2;
Empty set (0.00 sec)

mysql> show create table t1;
+-------+---------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------+
| t1 | CREATE TABLE `t1` ( |
| `id` smallint(5) unsigned NOT NULL AUTO_INCREMENT, |
| `name` varchar(50) NOT NULL, |
| PRIMARY KEY (`id`), |
| KEY `idx_01` (`name`) |
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 |
+-------+---------------------------------------------------+
1 row in set (0.00 sec)

mysql> show create table t2;
+-------+---------------------------------------------------+
| Table | Create Table |
+-------+---------------------------------------------------+
| t2 | CREATE TABLE `t2` ( |
| `id` smallint(5) unsigned NOT NULL AUTO_INCREMENT, |
| `name` varchar(50) NOT NULL, |
| PRIMARY KEY (`id`), |
| KEY `idx_01` (`name`) |
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-------+---------------------------------------------------+
1 row in set (0.00 sec)

mysql>

As you can see, it’s an interesting method because it gets all the structure (índex, engine, encoding) e naturally, reset the AUTO_INCREMENT atribute.

Cheers!

The pursuit of (happy|plain)ness

Você pode ler este post em português também.

Hello!

Write and maintain a code should be a pleasure task. Forget about complicated code, difficult things to be understood by human being.

If your program requires concepts that go beyond the classic algorithms, use comments to alert the (or that) future maintainer(s) about what he (they) need to know to maintain the code. Helps a lot to think that next maintainer would be someone else. Consider the simple story of that old KISS (Keep It Simple, Stupid).

Think about the English language. It has a dictionary and rules for syntax and semantics in your writing. There are thousands of resources to write different kinds of text: narratives, poetry and many others. And each type has its rules. In this analogy, each language (PHP, Python, etcetera) Is the English language. Algorithms and design patterns (this last is under the spotlights nowadays) are the types of text available. Each type of text (algorithms and design patterns) have their application, and we need to know and apply them – at the right time. They’re also the issue of preference: I avoid poetry as well I avoid certain algorithms and patterns.

Tell yourself what is a good text. For me, a good text is direct and well focused on its subject, more joinable facts than abstract ideas. Use simple words, the simplest that can express the required sense.

Today, any source code is written almost in the same way that we talk to someone. Manipulate data now has been freed us from those endless loops on tables that needed to be manually locked and indexed. Applications are now object oriented, need to be made for the web, for your phone, for the Xbox, smartphones, 52″ screens and so on.

The human-computer interaction is now an indispensable discipline, fractal-like by the software engineering. By the way, everything related to software development is fractal-like nowadays: We can complicate it to absurd levels, even when unnecessary. It’s common to hear about “abstraction data layers” in applications that never changed your DBMS in last 10 years (and whose supplier will be there for another 10 years).

The subject of this post is a request: I ask to all people involved in a software project to keep things simple, readable and explicit. Tim Peters (a Python guru) has said that. Take a time to read books about algorithms and design patterns. Unlock your creativity. When you develop for the web, think about accessibility, and about it yet, cross your browser boundaries: In fact, the future of your application is unpredictable, and you never know when it will be used inside a big screen or in a cellphone. Use javascript very carefully (always in a suspicious way, and if you can, avoid it).

Cheers!