Install MySQL on CentOS

How to install MySQL

Install MySQL
yum install mysql-server mysql php-mysql

How to configure MySQL

Set the MySQL service to start on boot
chkconfig --level 235 mysqld on
Start the MySQL service
service mysqld start
Log into MySQL
mysql -u root
Set the root user password for all local domains
SET PASSWORD FOR 'root'@'localhost' = PASSWORD('new-password');
SET PASSWORD FOR 'root'@'localhost.localdomain' = PASSWORD('new-password');
SET PASSWORD FOR 'root'@'127.0.0.1' = PASSWORD('new-password');
Drop the anonymous (Any) user
DROP USER ''@'localhost';
DROP USER ''@'localhost.localdomain';
Exit MySQL
exit

Install Scala on CentOS

cd /tmp
wget http://www.scala-lang.org/files/archive/scala-2.11.2.rpm
sudo rpm -i scala-2.11.2.rpm


Alternative (install from the tarball):
Scala files are now here: http://www.scala-lang.org/files/archive/
wget http://www.scala-lang.org/files/archive/scala-2.10.1.tgz
tar xvf scala-2.10.1.tgz
sudo mv scala-2.10.1 /usr/lib
sudo ln -s /usr/lib/scala-2.10.1 /usr/lib/scala
export PATH=$PATH:/usr/lib/scala/bin
scala -version

Setting up a Python environment

CentOS:
yum install python2.7
yum install lapack lapack-devel blas blas-devel

pip install virtualenv
cd ~/src/yourprojectdir
virtualenv -p `which python2.7` venv --distribute
source venv/bin/activate
vi requirements.txt
..add ipython, scipy and other dependencies

pip install -r requirements.txt

ipython

..enjoy

Install VirtualBox on CentOS 6.5

I followed these instructions: http://www.if-not-true-then-false.com/2010/install-virtualbox-with-yum-on-fedora-centos-red-hat-rhel/

1. Change to root User
su -
2. Install Fedora or RHEL Repo Files
cd /etc/yum.repos.d/
wget http://download.virtualbox.org/virtualbox/rpm/rhel/virtualbox.repo
3. Update to the latest packages and check your kernel version
Update packages
yum update
Check that you are running the latest installed kernel version
The version numbers in the output of the following commands should match:
rpm -qa kernel |sort -V |tail -n 1
uname -r
Note: If you got a kernel update, or you are running a kernel older than the newest installed one, reboot:
reboot
4. Install the following dependency packages
CentOS 6/5 and Red Hat (RHEL) 6/5 need the EPEL repository; install it with the following command:
## CentOS 6 and RHEL 6 ##
rpm -Uvh http://dl.fedoraproject.org/pub/epel/6/i386/epel-release-6-8.noarch.rpm

5. Install VirtualBox Latest Version 4.3 (currently 4.3.10)
yum install VirtualBox-4.3
Note:
This command automatically creates the vboxusers group; every VirtualBox user must be a member of that group.
This command also builds the needed kernel modules.
Rebuild the kernel modules with the following command:
service vboxdrv setup
6. Add VirtualBox User(s) to vboxusers Group
Replace user_name with your own user name or another real user name.
usermod -a -G vboxusers user_name
7. Start VirtualBox
- Start from GUI Applications -> … -> VM
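
The kernel check in step 3 compares two strings that are not formatted identically: rpm prints the package name with a kernel- prefix, while uname -r prints only the release. The sketch below shows that comparison in Python. The helper names and the sample version strings are hypothetical, chosen for illustration only, and the exact output format of rpm on your system is an assumption.

```python
def strip_package_prefix(package_name, prefix="kernel-"):
    """Drop the leading 'kernel-' that rpm prints but uname -r does not."""
    if package_name.startswith(prefix):
        return package_name[len(prefix):]
    return package_name


def needs_reboot(newest_installed_package, running_release):
    """True when the running kernel is not the newest installed one."""
    return strip_package_prefix(newest_installed_package) != running_release


# Hypothetical sample values, standing in for the output of
#   rpm -qa kernel | sort -V | tail -n 1
#   uname -r
newest = "kernel-2.6.32-431.11.2.el6.x86_64"
running_old = "2.6.32-431.el6.x86_64"
running_new = "2.6.32-431.11.2.el6.x86_64"

print(needs_reboot(newest, running_old))
print(needs_reboot(newest, running_new))
```

If the function reports a mismatch, reboot before installing VirtualBox so that the kernel modules are built against the kernel you will actually run.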

Then I downloaded a ready-to-use VM image:

http://www.modern.ie/en-us/virtualization-tools#downloads

It is valid for 30 days only, so I plan to buy a licensed Win 8.1 if it’s stable and performs well on 8GB RAM. If not, I am going to downgrade to 7 or Vista.

Dimensionality Reduction with the PCA algorithm

There are a lot of good articles that describe the theory of dimensionality reduction with various algorithms such as PCA. Some of them have really good examples (for instance this one: http://blog.yhathq.com/posts/image-classification-in-Python.html).
However, in order to apply and use it I want to develop intuition: what does it mean, from a mathematical/machine standpoint, to reduce a 132342-dimensional space to, let’s say, 2D? After several hours of playing around with the sklearn PCA implementation I’ve come up with the following representation, which shows the 1st component of the 2-dimensional space:

This is how a machine sees the data. On the left are the 2 untransformed input data samples. On the right are the same samples projected to 2D and mapped back into the 132342-dimensional space using the 1st component only: simply the 1st element of the 2-element projected array multiplied by the 1st column of the so-called U matrix, which has 132342 elements.
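
The projection and the mapping back through a single component can be sketched in a few lines. This is only an illustration under stated assumptions, not the code behind the picture: it uses NumPy's SVD rather than the sklearn PCA implementation, a random 50-dimensional toy dataset rather than the 132342-dimensional samples, and the function names (pca_fit, project, reconstruct_component) are made up for this example. The rows of `components` play the role of the columns of the U matrix.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return the feature mean and the top principal directions of X (rows = samples)."""
    mean = X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def project(X, mean, components):
    """Map samples from the original space into the reduced space."""
    return (X - mean) @ components.T


def reconstruct_component(Z, mean, components, k):
    """Map only the k-th reduced coordinate back into the original space."""
    return mean + np.outer(Z[:, k], components[k])


rng = np.random.default_rng(0)
# Toy stand-in for the real data: 20 samples, 50 features, two groups.
group_a = rng.normal(0.0, 1.0, size=(10, 50))
group_b = rng.normal(3.0, 1.0, size=(10, 50))
X = np.vstack([group_a, group_b])

mean, components = pca_fit(X, n_components=2)
Z = project(X, mean, components)                            # shape (20, 2)
X_first = reconstruct_component(Z, mean, components, k=0)   # shape (20, 50)

print(Z.shape, X_first.shape)
```

Each row of `X_first` has the same length as an input sample, so it can be displayed the same way as the original data, which is what makes the side-by-side comparison possible.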

As you can see, once the data points are projected into 2D there is a clear separation between the different data point types, which can then be used by a logistic regression algorithm.
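
To illustrate how such a separation can feed a logistic regression, here is a minimal sketch. The 2D points are synthetic stand-ins for projected samples, not the real data, and fit_logistic is a hypothetical helper that runs plain gradient descent with NumPy; in practice a library classifier would normally be used instead.

```python
import numpy as np


def sigmoid(t):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-t))


def fit_logistic(Z, y, lr=0.1, steps=500):
    """Plain gradient descent on the logistic loss; returns weights and bias."""
    w = np.zeros(Z.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(Z @ w + b)        # predicted probability of class 1
        grad = p - y                  # derivative of the loss w.r.t. the logit
        w -= lr * (Z.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


rng = np.random.default_rng(1)
# Synthetic 2D projections of two well-separated data point types.
Z = np.vstack([rng.normal(-2.0, 0.5, size=(30, 2)),
               rng.normal(2.0, 0.5, size=(30, 2))])
y = np.array([0] * 30 + [1] * 30)

w, b = fit_logistic(Z, y)
pred = (sigmoid(Z @ w + b) >= 0.5).astype(int)
accuracy = (pred == y).mean()
print(accuracy)
```

Because the classifier only sees 2 features per sample instead of the full input dimension, training is cheap; the quality of the result depends entirely on how well the projection separates the classes.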

Data Science course

A practical 101 course on Big Data from Harvard by Hanspeter Pfister and Joe Blitzstein. I highly recommend it for junior and mid-level engineers. This course, together with ML by Andrew Ng and AI by Sebastian Thrun and Peter Norvig, builds a solid base for further Intelligent Big Data processing.
Course home page: http://cs109.org

HW0 – Setup environment: http://nbviewer.ipython.org/github/cs109/content/blob/master/HW0.ipynb
- Python Environment: https://github.com/cs109/content/wiki/Installing-Python

Play 2.0 on Heroku

A few simple steps and your web application built on the Play 2.0 framework is running in the Heroku cloud.

Install Play2.0 framework
Sign up to Heroku and install toolbelt
Create Public/Private keys
Create a new project (play new)
Create Procfile in root dir of your project with one single line (web: target/start -Dhttp.port=${PORT} ${JAVA_OPTS})
Commit to Git (git init, git add ., git commit -m "init")
Login to Heroku
Create Cedar stack (heroku create --stack cedar)
Push changes to Heroku (git push heroku master)
Scale (start) your first web dyno (heroku ps:scale web=1, heroku restart)

Resources:
- https://devcenter.heroku.com/articles/quickstart
- https://github.com/playframework/Play20/wiki/ProductionHeroku