Search

Todd Rodzen

Agile Application Development

Building a public Amazon AMI

A frustrating issue with building a public Amazon AMI is the authorize key that you use to build and modify the instance must be removed (which removes your own access to the instance.) The private key must be removed before it is shared.

aws-logo-01

It’s like the old problem, which comes first the chicken or the egg. So you remove the key but now you can’t login to your own system. You can only rebuild an EC2 from the ami image. Beyond that, you only do an rm to delete the file but the block of key data is still there in the EBS disk image. Someone could easily unpack the block and undelete the file to restore the authorize key file, connect to your private instances and run up your AWS bill or worse.

What’s the solution? Using additional EBS connections to create an image. Here is the procedure:

  1. Create a new 1gb EBS volume, attach, and mount it on the running instance, say under /keys Use the Amazon EBS guide to format and attach the EBS volume
  2. Copy your authorized_keys to the /keys on the new EBS
  3. Delete all sensitive files and all authorized_keys (from the primary EBS) Also delete the bash.history file and any other logs or passwords.
     sudo chmod 660 /root/.bash_history

     

  4. Exit Putty terminal windows and using Filezilla save empty history files to /root/.bash_history and /home/ec2-user/.bash_history
  5. Delete /tmp files
  6. Do not snapshot the live EBS volume as it still contains the deleted files and you don’t want to make them public in the new AMI. Instead,
  7. Create a new EBS volume, attach, and mount it on the running instance, say under /ebsimage
  8. Copy the root file system over to the new EBS volume. This only copies the current view of the undeleted files and does not copy the blocks containing the deleted files or any other modified file information. The command might look something like:
    rsync -axvSHAX --exclude 'ebsimage' / /ebsimage/
    
  9. Copy you authorize_keys back to your primary EBS
  10. unmount and detach the new EBS volume.
  11. Create an EBS snapshot of the new EBS volume.
  12. Register the EBS snapshot as a new AMI.
Advertisements

Lets Encrypt

The following is a re-post excerpt from Brennen Bearnes at https://www.digitalocean.com/community/tutorials/how-to-set-up-a-node-js-application-for-production-on-ubuntu-16-04 with great thanks!

Install Let’s Encrypt and Dependencies

Let’s Encrypt is a new Certificate Authority that provides an easy way to obtain free TLS/SSL certificates.

You must own or control the registered domain name that you wish to use the certificate with. If you do not already have a registered domain name, you may register one with one of the many domain name registrars out there (e.g. Namecheap, GoDaddy, etc.).

If you haven’t already, be sure to create an A Record that points your domain to the public IP address of your server. This is required because of how Let’s Encrypt validates that you own the domain it is issuing a certificate for. For example, if you want to obtain a certificate for example.com, that domain must resolve to your server for the validation process to work.

For more detail on this process, see How To Set Up a Host Name with DigitalOcean and How To Point to DigitalOcean Nameservers from Common Domain Registrars.

Although the Let’s Encrypt project has renamed their client to certbot, the name of the package in the Ubuntu 16.04 repositories is simply letsencrypt. This package will be completely adequate for our needs.

To install the package, type:

  • sudo apt-get install letsencrypt

The letsencrypt client should now ready to use on your server.

Retrieve Initial Certificate

Since nginx is already running on port 80, and the Let’s Encrypt client needs this port in order to verify ownership of your domain, stop nginx temporarily:

  • sudo systemctl stop nginx

Run letsencrypt with the Standalone plugin:

  • sudo letsencrypt certonly –standalone

You’ll be prompted to answer several questions, including your email address, agreement to a Terms of Service, and the domain name(s) for the certificate. Once finished, you’ll receive notes much like the following:

IMPORTANT NOTES:
 - Congratulations! Your certificate and chain have been saved at
   /etc/letsencrypt/live/your_domain_name/fullchain.pem. Your cert will expire
   on 2016-08-10. To obtain a new version of the certificate in the
   future, simply run Let's Encrypt again.
 - If you like Let's Encrypt, please consider supporting our work by:

   Donating to ISRG / Let's Encrypt:   https://letsencrypt.org/donate
   Donating to EFF:                    https://eff.org/donate-le

Note the path and expiration date of your certificate, highlighted in the example output. Your certificate files should now be available in /etc/letsencrypt/your_domain_name/.

Configure Nginx for HTTPS

You’ll need to add some details to your Nginx configuration. Open /etc/nginx/sites-enabled/defaultin nano (or your editor of choice):

  • sudo nano /etc/nginx/sites-enabled/default

Replace its contents with the following:

/etc/nginx/sites-enabled/default
# HTTP - redirect all requests to HTTPS:
server {
        listen 80;
        listen [::]:80 default_server ipv6only=on;
        return 301 https://$host$request_uri;
}

# HTTPS - proxy requests on to local Node.js app:
server {
        listen 443;
        server_name your_domain_name;

        ssl on;
        # Use certificate and key provided by Let's Encrypt:
        ssl_certificate /etc/letsencrypt/live/your_domain_name/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/your_domain_name/privkey.pem;
        ssl_session_timeout 5m;
        ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
        ssl_prefer_server_ciphers on;
        ssl_ciphers 'EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH';

        # Pass requests for / to localhost:8080:
        location / {
                proxy_set_header X-Real-IP $remote_addr;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header X-NginX-Proxy true;
                proxy_pass http://localhost:8080/;
                proxy_ssl_session_reuse off;
                proxy_set_header Host $http_host;
                proxy_cache_bypass $http_upgrade;
                proxy_redirect off;
        }
}

Exit the editor and save the file.

Check the configuration for syntax errors by typing:

  • sudo nginx -t

When no errors are detected, start Nginx again:

  • sudo systemctl start nginx

You can test your new certificate and Nginx configuration by visiting http://your_domain_name/ in your browser. You should be redirected to https://your_domain_name/, without any security errors, and see the “Hello World” printed by your Node.js app.

Set Up Let’s Encrypt Auto Renewal

Warning: You can safely complete this guide without worrying about certificate renewal, but you will need to address it for any long-lived production environment.

You may have noticed that your Let’s Encrypt certificate is due to expire in 90 days. This is a deliberate feature of the Let’s Encrypt approach, intended to minimize the amount of time that a compromised certificate can exist in the wild if something goes wrong.

The Let’s Encrypt client can automatically renew your certificate, but in the meanwhile you will either have to repeat the certificate retrieval process by hand, or use a scheduled script to handle it for you. The details of automating this process are covered in How To Secure Nginx with Let’s Encrypt on Ubuntu 16.04, particularly the section on setting up auto renewal.

Reverse proxy on node.js

A reverse proxy is an important part of the puzzle of a production application. The process is to create a reverse proxy Nginx server that interacts with the world and dishes out the requests from the user to a farm of back-end application node.js or Apache servers. The actual backend application server can be secured to only communicate with the reverse proxy server, therefore limiting its vulnerability to attacks.

The good thing about the agile application design is you don’t have to modify your code for reverse proxy except to understand different processes may want to be broken up to different servers or a farm of servers. Therefore creating small single use back-end applications is preferred over a single larger more complex back end server design that does everything in one process. For example serving email and user logins are certainly better designed by different application processes.

Another benefit of running a Nginx reverse proxy is the single reverse proxy can server applications and website from both Apache servers and node.js servers, therefore, mydomain.com might be served by the Apache server while mydomain.com/app might be served by the node.js server.

A Nginx based reverse proxy server is installed with the following:

sudo yum install nginx
sudo chmod 664 /etc/nginx/nginx.conf

Then use Filezilla to add the following lines to the location / {} directives in the /etc/nginx/nginx.conf file

location / {
 proxy_pass http://localhost:8080;
 proxy_http_version 1.1;
 proxy_set_header Upgrade $http_upgrade;
 proxy_set_header Connection 'upgrade';
 proxy_set_header Host $host;
 proxy_cache_bypass $http_upgrade;
 }

location /node {
 proxy_pass http://localhost:4200;
 proxy_http_version 1.1;
 proxy_set_header Upgrade $http_upgrade;
 proxy_set_header Connection 'upgrade';
 proxy_set_header Host $host;
 proxy_cache_bypass $http_upgrade;
 }
Restart the Nginx reverse proxy server
sudo service nginx restart

Add auto start to the nginx service with

chkconfig nginx on

I prefer to start Nginx as a reverse proxy on port 80 and change the default root of httpd.conf to 8080 Therefore unless it’s a specifically defined location route it will default proxy through Nginx to the apache server.

That’s All.

to Cluster or not to Cluster

This blog will review cluster environment setup to handle node multi-threaded service instances. In our last post, we reviewed the methods of using pooled connections for MySQL connector in node.js on a Linux server running on an Amazon AWS EC2 instance. The conclusion was that it is necessary from the beginning of the application development process to write javascript back-end application server code that uses MySQL pooling methods. This is a significant change in the application code from the non-pooled connection method API. It is important to understand the techniques you intend to use before setting down to write your production code for a back-end application server because it’s important to write your application once and do it right the first time.

Now let’s take a look at clusters. This is a built-in part of the node.js methods. (We are not talking about the clusters add-on module from npmjs.com that goes by the same name.) Using the built in clusters methods for node.js is described in multiple blog posts here and here. The documentation and posts describe setting up a cluster.js that runs a master and fork process servers that run your app.js code multiple times utilizing multiple core processor threads. This increases throughput and creates a multi-threaded environment. The simplified node.js cluster environment code is shown here.

var cluster = require('cluster');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {

//  for (var i = 0; i < numCPUs; i++){
  for (var i = 0; i < 10; i++){
    cluster.fork();
  }

  cluster.on('exit',function(worker, code, signal) {
    console.log('worker ' + worker.process.pid + ' died');
  });
} else {

    //change this line to Your Node.js app entry point.
    require("./app.js");
}

In simplified terms, the code above creates multiple forks using a Master node cluster to run multiple instances (multi-threaded) of your node.js application. We did some testing on our Amazon AWS EC2 t2.micro (free tier) and using our async-test.js test, from our prior blog post, using an external RDS MySQL database with pooled MySQL connections. We used Siege of up to 500 concurrent connections and hit it constantly for 1 minute resulting in thousands of hits. What we found is a single process environment resulted in database connection errors while a cluster run environment of 10 processes (multi-threaded) running the same async-test resulted in no connection errors. This proves a cluster multi-threaded environment is important.

When you look at my code above you can see I altered the code to force 10 process forks (instead of using numCPUs.) Remember I am running this on a virtual shared server using Amazon AWS. When using the os.cpus ().length the os method reports back 1 core process thread.  This is the number of process threads available as designed by AWS. But a little digging into the full array results of the os.cpus () method reveals my AWS t2.micro is actually running on a XEON E5 2670 2.5ghz processor with 10 cores and 20 threads. If you did this on a single dedicated machine running that processor you would get a result of 20 for the os.cpus ().length method. Actually, Amazon does a lot of behind the scenes stuff to throttle the processes. But getting more than what you pay for is not at issue here. (Even though you may be getting it all for free through the free tier.) The issue is a working production application design that doesn’t fail at critical points like peak database traffic. So what we found is the processor even on the AWS t2.micro (free tier) was able to better handle the traffic on multiple clustered processes and it could handle a huge amount of traffic. One seige test resulted in no connection errors of over 8000 hits in 30 seconds. Since it is a shared server it is doubtful 20 threads would be useful. It’s interesting to note Amazon actually defines the t2.micro with 1 vCPU and 6 CPU credits/hour which is a throttle mechanism and different from actual core processing threads. Amazon uses processor credits to ensure you get what you pay for or to throttle your application as needed. As in life, you only get what you pay for! 🙂

But do we write code and change our application design to handle the cluster methods. No! We don’t change our code to use the cluster methods. There is a better way and it’s all handled for us using the PM2 module from npmjs.com. No need to create code like the cluster.js sample above. This multi-threading cluster environment, base node system control, monitoring, and performance optimization is likely a place where you don’t need to reinvent the wheel, there is already significant well-designed products to handle these functions. To start, just install the PM2 module with

npm install -g pm2

A useful additional tool in connection wth PM2 is the https://app.keymetrics.io dashboard monitor. You can go there and create a bucket and server connection to your PM2 server to generate metrics data and external monitoring and control. Use the PM2 documentation for all the PM2 commands but some of the useful commands are

Pm2 start app.js

Pm2 stop all

Pm2 start app.js -i 4  # to start 4 cluster instances of your app.js

Pm2 list  # to list all running process threads

There is another module called forever but we found PM2 is much more advanced, robust, and well supported.

One final item to do is setup PM2 to run on startup. Do the following command which creates a line item to copy and paste to your terminal window. The one line of code it creates will automatically start PM2 on reboot.

pm2 startup

What we do need to do in our application design is write good strong well-designed code that handles multiple instances running and save session or environment variables that can be accessed by all cluster process instances in a multi-threaded environment. In a later post we will cover using Redis as a global external store for process variables. As well codeforgeek provides a great tutorial. This is the real important part of application development. The two most important parts that must be designed to handle multi-threaded clusters and multi-instances are session like variables and database connection transactions. For example, three related SQL inserts or Redis store SETS must complete before another process tries to select that same and related set of data.

In conclusion, I recommend installing pm2 and use it from the start of the agile application development process. An added benefit of using pm2 in development is added logging and debugging methods.

to Pool or not to Pool

Using Node.js with MySQL Module I have done some testing. Under a real word situation stress on the server from multiple connections could result in a database connection failure. The answer is pooled connections. Here is my test code.

Non-pooled Connection

// test-sync.js
var express = require('express')
var app = express()

app.get('/test-sync', function (req, res) {
// console.log('1. Received Get')
 res.send('Hello World!')

var mysql = require('mysql');
var connection = mysql.createConnection({
 host : 'localhost',
 user : 'mazu',
 password : '',
 database : 'mazudb'
});
connection.connect(function(err){
if(!err) {
// console.log("2. Database Connected"); 
} else {
 console.log("Error connecting database ... nn"); 
}
});

sql="SELECT * FROM `test-mysql-Customers` LIMIT 2"
connection.query(sql, function(err, rows, fields) {
connection.end();
 if (!err) {
// console.log("3. SQL Completed");
} else
 console.log('Error while performing Query.');
 });
});

app.listen(4200, function () {
 console.log('Example app listening on port 4200!')
})

Pooled Connection

var express = require("express");
var mysql = require('mysql');
var app = express();

var pool = mysql.createPool({
 connectionLimit : 100, //important
 host : 'localhost',
 user : 'mazu',
 password : '',
 database : 'mazudb',
 debug : false
});

function handle_database(req,res) {
 
 pool.getConnection(function(err,connection){
 if (err) {
 res.json({"code" : 100, "status" : "Error in connection database"});
 var tlog = json({"code" : 100, "status" : "Error in connection database"});
 console.log(tlog);
 return;
 }

connection.setMaxListeners(0)

// console.log('connected as id ' + connection.threadId + ' connection.getMaxListeners()' + connection.getMaxListeners());
 
 sql="SELECT * FROM `test-mysql-Customers` LIMIT 2"

connection.query(sql,function(err,rows){
 connection.release();
 if(!err) {
 res.json(rows);
 } 
 });

connection.on('error', function(err) { 
 var tlog = json({"code" : 100, "status" : "Error in connection database"});
 console.log(tlog);
 res.json({"code" : 200, "status" : "Error in connection database"});
 return; 
 });
 });
}

app.get("/test-async",function(req,res){-
 handle_database(req,res);
});

app.listen(4200);

The primary difference between the two methods is the first one does a synchronous creatConnection, connect, query, and end; while the second pooled method does a createPool connection that creates a queue for the queries. When a get request is received it then does a getConnection that uses one process from the pooled process queue, it does its query, and finally a release of that one process in the pooled queue.

A stress test on these methods results in similar throughput. Essentially the same number of transactions can get through but with non-pooled connections, the likelihood of database connection errors is higher. I used a siege tool stress test with the following

siege -c200 -t60s -d3 http://localhost:4200/test-sync

It resulted in about 8000 hits but on the synchronous non-pooled method it resulted in about 16 database connection error, while the pooled method resulted in no connection errors. This test was with an Amazon EC t2.micro with Linux and an external RDS MySQL database. Obviously, database connection errors are bad! This proves a pooled connection is the way to go.

Express npm module

The Express module available from http://npmjs.com is a common tool to quickly build applications and can be used for back end node.js APIs. Let’s get started with Express on an Amazon Linux EC2 node.js server, do the following commands

mkdir -p -v /node/async-test && cd $_
npm init  # and answer the questions

npm install mysql --save
npm install express --save

Create the helloworld.js file in /node/async-test directory with the following Express module get and listen commands:

var express = require('express')
var app = express()

app.get('/helloworld', function (req, res) {
 res.send('Hello World!')
})

app.listen(4200, function () {
 console.log('Example app listening on port 4200!')
})

Run the node.js application with:

node helloworld.js

Now test at a web browser with:

http://myhost.amazonaws.com:4200/helloworld

The result on the web browser is Hello World! Go to http://expressjs.com/ for complete documentation on the Express module.

Linux Siege package

This describes the steps to install Siege on the Amazon AWS EC2 Linux server.

We will be using a tool called Siege to test the server with 100’s or 1000’s of simultaneous calls. Let’s first turn on the Extra Packages for Enterprise Linux (EPEL) repository from the Fedora project. By default, this repository is present in /etc/yum.repos.d on Amazon Linux instances, but it is not enabled.

sudo yum-config-manager --enable epel

Note: For information on enabling the EPEL repository on other distributions, such as Red Hat and CentOS, see the EPEL documentation at https://fedoraproject.org/wiki/EPEL.

Now install Siege with the following command and answer Y to the two prompts.

sudo yum install siege

Now you can lay Siege to a server with the following command

siege -c25 -t20s -d3 http://localhost:4200

-c25 = 25 simultaneous users
-t20s = do it continuously for 20 seconds
-d3 (default) = a random time delay for each request of 1 to 3 seconds.
url request = http://localhost:4200

The full documentation on Siege is available at https://www.joedog.org/siege-manual/

 

node.js to MySQL Connector

Here we will setup the node.js server to connect to a MySQL database.

Using a Putty / SSH terminal window on the Linux node.js server first go to your working directory for your node-js application then install the mysql module using npm as shown below. Details of the mysql package we are using are found at https://www.npmjs.com/package/mysql

It is important to note npm install just installs the package to a directory called /node_modules/ within your current working directory. Since we don’t want a package that may be modified or removed to adversely affect our application we will install the package directly in the /node application working directory.

cd /node
npm install mysql

Then using Sublime or text editor create a test-mysql.js file with the below code. Next using Filezilla or a SSH FTP client upload the file to the /vnode directory.

const readline = require('readline');
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout
});
rl.question('mysql host name? ', (hostanswer) => {
rl.question('mysql user name? ', (useranswer) => {
rl.question('mysql password? ', (passwordanswer) => {
rl.question('mysql database? ', (dbanswer) => {
rl.close();


var mysql = require('mysql');
var connection = mysql.createConnection({
host : hostanswer,
user : useranswer,
password : passwordanswer,
database : dbanswer,
multipleStatements: true
});

connection.connect();

var sql="DROP TABLE IF EXISTS `test-mysql-Customers`; CREATE TABLE `test-mysql-Customers` (`CompanyName` varchar(45) NOT NULL,`City` varchar(45) DEFAULT NULL, `Country` varchar(45) DEFAULT NULL,PRIMARY KEY (`CompanyName`)); INSERT INTO `test-mysql-Customers` VALUES ('Company1','City1','Country1'),('Company2','City2','Country2'); SELECT * from `test-mysql-Customers`";

connection.query(sql, function(err, rows, fields) {
if (!err) {
console.log('The mysql return data: ', rows);
console.log('The mysql test completed successfully. test-mysql-Customers table was created.');
} else
console.log('Error while performing Query.');
});

connection.end();


});
});
});
});

Next go to the terminal window still in the /node working directory and run the test-mysql.js server application. When prompted enter a valid mysql server. In a previous post I created a mysql database on an Amazon RDS server. An RDS or any valid mysql server will work. The test simply creates a database table on the server called test-mysql-Customer with 3 fields and 2 records.

node test-mysql.js

The results will be the returned SQL string data and the final line will have:

The mysql test completed successfully. test-mysql-Customers table was created.

That’s all folks!

Next we will explore creating an asynchronous real word server.

Why node

One of greatest advantages of node is a block-less environment.  Better described by the node.js wiki:

Node.js is primarily used to build network programs such as Web servers.[43] The biggest difference between Node.js and PHP is that most functions in PHP block until completion (commands execute only after previous commands have completed), while functions in Node.js are designed to be non-blocking (commands execute in parallel, and use callbacks to signal completion or failure).[43]

This non-blocking feature is ideal for real-time updates like chat rooms and social media.

Powered by WordPress.com.

Up ↑