Android+TensorFlow+CNN+MNIST: Handwritten Digit Recognition

Catalogue

  1. Overview
  2. Practice
    2.1. Environment
    2.2. Train & Evaluate (Python+TensorFlow)
    2.3. Test (Android+TensorFlow)
  3. Theory
    3.1. MNIST
    3.2. CNN (Convolutional Neural Network)
      3.2.1. CNN Keys
      3.2.2. CNN Architecture
    3.3. Regression + Softmax
      3.3.1. Linear Regression
      3.3.2. Softmax Regression
  4. References & Recommends

Overview

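This is the first article in the "SkySeraph AI: From Practice to Theory" series. Starting from the classic MNIST dataset, the Hello World of AI, it implements CNN-based handwritten digit recognition on Android with TensorFlow.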
Code~


Practice

Environment

  • TensorFlow: 1.2.0
  • Python: 3.6
  • Python IDE: PyCharm 2017.2
  • Android IDE: Android Studio 3.0

Train & Evaluate(Python+TensorFlow)

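The main goal of the training and evaluation stage is to produce the .pb file used for testing. This file stores the topology and parameters of the network built and trained with the TensorFlow Python API. There are many ways to build the network: besides a CNN, you could also use an RNN, a fully connected network (FCNN), and so on.
TensorFlow offers two sets of CNN functions, tf.layers.conv2d and tf.nn.conv2d. tf.layers.conv2d uses tf.nn.conv2d as its backend. They differ in their parameters: filters (tf.layers.conv2d) is an integer, while filter (tf.nn.conv2d) is a 4-D tensor. The prototypes are as follows.
In convolutional.py:
def conv2d(inputs, filters, kernel_size, strides=(1, 1), padding='valid', data_format='channels_last',
           dilation_rate=(1, 1), activation=None, use_bias=True, kernel_initializer=None,
           bias_initializer=init_ops.zeros_initializer(), kernel_regularizer=None, bias_regularizer=None,
           activity_regularizer=None, kernel_constraint=None, bias_constraint=None, trainable=True, name=None,
           reuse=None)

In gen_nn_ops.py:

def conv2d(input, filter, strides, padding, use_cudnn_on_gpu=True, data_format="NHWC", name=None)

The official demo uses the layers module, with the following structure: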

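  • Convolutional Layer #1: 32 filters of size 5×5, with ReLU activation
  • Pooling Layer #1: max pooling with a 2×2 filter and a stride of 2
  • Convolutional Layer #2: 64 filters of size 5×5, with ReLU activation
  • Pooling Layer #2: max pooling with a 2×2 filter and a stride of 2
  • Dense Layer #1: 1024 neurons, with ReLU activation and a dropout rate of 0.4 (to avoid overfitting, 40% of the neurons are randomly dropped during training)
  • Dense Layer #2 (Logits Layer): 10 neurons, one for each class (0-9)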
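
To make the layer sizes above concrete, here is a small sketch that traces the tensor shape through the six layers. It assumes a 28×28×1 input and stride-1 convolutions with "same" padding, neither of which is stated in the list, and the helper names are made up for this illustration.

```python
# Trace feature-map shapes through the demo architecture listed above.
# Assumptions (not stated in the list): 28x28x1 input, "same" padding,
# and stride-1 convolutions.

def conv_same(shape, filters):
    """Stride-1 convolution with 'same' padding: height/width unchanged."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool(shape, size=2, stride=2):
    """Unpadded max pooling: height and width shrink, channels unchanged."""
    height, width, channels = shape
    return ((height - size) // stride + 1, (width - size) // stride + 1, channels)


def trace_shapes(input_shape=(28, 28, 1)):
    """Return a list of (layer name, output shape) pairs."""
    shapes = [("input", input_shape)]
    shape = conv_same(input_shape, 32)
    shapes.append(("conv1", shape))
    shape = max_pool(shape)
    shapes.append(("pool1", shape))
    shape = conv_same(shape, 64)
    shapes.append(("conv2", shape))
    shape = max_pool(shape)
    shapes.append(("pool2", shape))
    height, width, channels = shape
    shapes.append(("flatten", (height * width * channels,)))
    shapes.append(("dense1", (1024,)))
    shapes.append(("logits", (10,)))
    return shapes


if __name__ == "__main__":
    for name, shape in trace_shapes():
        print(name, shape)
```

Under these assumptions the second pooling layer outputs a 7×7×64 volume, which is flattened before the first dense layer.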

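The core code lives in the cnn_model_fn(features, labels, mode) function, which contains the complete definition of the convolutional structure.

The traditional tf.nn.conv2d function can be used instead.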

Test(Android+TensorFlow)

  • The core is the API class TensorFlowInferenceInterface.java
  • Configure gradle, or build TensorFlow from source and import the resulting jar and so files
    compile 'org.tensorflow:tensorflow-android:1.2.0'
  • Import the .pb file: place it in the assets directory and read it from there. The snippet below reads the label file from assets in the same way.

    String actualFilename = labelFilename.split("file:///android_asset/")[1];
    Log.i(TAG, "Reading labels from: " + actualFilename);
    BufferedReader br = new BufferedReader(new InputStreamReader(assetManager.open(actualFilename)));
    String line;
    // Each line of the label file is one class label
    while ((line = br.readLine()) != null) {
        c.labels.add(line);
    }
    br.close();

  • Using the TensorFlow interface
  • Final result:

Theory

MNIST

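MNIST is one of the most classic datasets in machine learning: a database of 28×28 grayscale images of the handwritten digits 0-9, with 60,000 training examples and 10,000 test examples.
The dataset consists mainly of four binary files: the training images, the test images, and their labels.

In the binary layout of the training image file, the actual pixel data is preceded by a few header fields (magic number, number of images, number of rows, and number of columns). The data is stored in big-endian order.
(Big-endian means that the most significant byte is stored at the lowest memory address and the least significant byte at the highest.)

To use the data in an experiment you need to extract the pixel values. One option is the unpack_from method of struct, the Python library for handling packed bytes. The core call is:
struct.unpack_from(self._fourBytes2, buf, index)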
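
As a rough illustration of the header layout described above, the sketch below reads the four big-endian header fields and one image with struct.unpack_from. The format strings and function names are assumptions made for this example (the value behind self._fourBytes2 is not shown in the article), and the buffer is a tiny synthetic file rather than real MNIST data; 2051 is the magic number documented for MNIST image files.

```python
import struct

HEADER_SIZE = 16  # four 32-bit fields: magic, image count, rows, columns


def read_idx_image_header(buf):
    """Return (magic, num_images, rows, cols) from an IDX image buffer."""
    # '>' selects big-endian; 'IIII' reads four unsigned 32-bit integers.
    return struct.unpack_from(">IIII", buf, 0)


def read_image(buf, index, rows, cols):
    """Return image number `index` as a list of rows of pixel values (0-255)."""
    offset = HEADER_SIZE + index * rows * cols
    pixels = struct.unpack_from(">%dB" % (rows * cols), buf, offset)
    return [list(pixels[r * cols:(r + 1) * cols]) for r in range(rows)]


# Synthetic stand-in for a real file: one 2x2 image.
demo = struct.pack(">IIII", 2051, 1, 2, 2) + bytes([0, 255, 128, 7])

if __name__ == "__main__":
    magic, count, rows, cols = read_idx_image_header(demo)
    print(magic, count, rows, cols)
    print(read_image(demo, 0, rows, cols))
```

With a real file, buf would be the bytes read from the training image file, and rows and cols would both be 28.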

Because MNIST is the Hello World dataset of AI, TensorFlow ships a ready-made helper for it that can be used directly:
mnist = input_data.read_data_sets('MNIST', one_hot=True)

CNN(Convolutional Neural Network)

CNN Keys

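  • CNN stands for Convolutional Neural Network, also known as a ConvNet.
  • Convolution is arguably the most important concept in modern deep learning. It is a mathematical operation; reference [23] covers the underlying mathematics, including derivations of the definition of convolution from the Fourier transform and from the Dirac δ function. Loosely and literally, it can be pictured as flipping one factor, multiplying, and "rolling up" (summing) the results.
  • Convolution animation. See the figure from [26]; more animations are available in [27].
  • Neural network. A system made up of a large number of neurons, as shown in the figure from [21].

    Here x is the input vector, w the weights, b the bias, and f the activation function.
  • Activation function: common nonlinear activation functions include Sigmoid, tanh, and ReLU.
    • Sigmoid drawbacks
      • Saturation makes gradients vanish (the neuron saturates when its output is close to 0 or 1, and in those regions the gradient is almost 0)
      • The sigmoid function is not symmetric about the origin (its output is not zero-centered)
    • tanh: it also saturates, but its output is zero-centered, so in practice tanh is preferred over sigmoid.
    • ReLU
      • Advantage 1: ReLU greatly accelerates the convergence of SGD
      • Advantage 2: the activation needs only a threshold, with no expensive (exponential) operations
      • Drawback: the learning rate must be set carefully to keep units from "dying" during training; Leaky ReLU, PReLU, or Maxout can be used instead
  • Pooling. The usual variants are mean pooling and max pooling; the figure from [21] shows max pooling. There are also overlapping pooling [24] and spatial pyramid pooling [25].
    • Mean pooling: the average of an image region becomes the pooled value of that region.
    • Max pooling: the maximum of an image region becomes the pooled value of that region.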
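
The activation functions and the two pooling variants described above can be sketched in a few lines of plain Python. This is only an illustration on a hand-made 4×4 input; the function names and the sample values are assumptions made for this example, not part of the original project.

```python
import math


def sigmoid(x):
    """Squashes x into (0, 1); saturates for large |x|."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squashes x into (-1, 1); zero-centered, but also saturates."""
    return math.tanh(x)


def relu(x):
    """Passes positive values through and clamps negatives to 0."""
    return max(0.0, x)


def pool2d(image, size=2, stride=2, mode="max"):
    """Pool a 2-D list of numbers with a size x size window."""
    combine = max if mode == "max" else (lambda values: sum(values) / len(values))
    pooled = []
    for r in range(0, len(image) - size + 1, stride):
        row = []
        for c in range(0, len(image[0]) - size + 1, stride):
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            row.append(combine(window))
        pooled.append(row)
    return pooled


image = [
    [1, 1, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]

if __name__ == "__main__":
    print(pool2d(image, mode="max"))   # one maximum per 2x2 block
    print(pool2d(image, mode="mean"))  # one average per 2x2 block
    print(sigmoid(0.0), tanh(0.0), relu(-3.0))
```

With a 2×2 window and a stride of 2, the 4×4 input shrinks to 2×2, which is the same halving that the pooling layers in the Practice section perform.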

CNN Architecture

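  • Three-layer neural network: an input layer, a hidden layer, and an output layer, as shown in the figure from [21].
  • CNN layer hierarchy. Stanford's cs231n describes the pattern [INPUT-CONV-RELU-POOL-FC], shown in the figure from [21]: input layer, convolutional layer, activation layer, pooling layer, and fully connected layer.
  • A general CNN architecture is built from three kinds of layers:
    • Convolutional layers
    • Pooling layers
    • Dense (fully connected) layers
  • Animation. See [22].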

Regression + Softmax

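Supervised learning has two main families of algorithms: classification, which predicts discrete values, and regression, which predicts continuous values.
The purpose of regression is to build a regression equation for predicting the target value; solving the regression means finding the coefficients of that equation.
Regression algorithms include Linear Regression, Logistic Regression, and others. Softmax Regression generalizes Logistic Regression to multi-class classification; its classic application is classifying the MNIST handwritten digits.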

Linear Regression

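Linear Regression is the most basic model in machine learning. Its goal is to make the predictions fit the target labels as closely as possible.

  • Definition of the multiple linear regression model
  • Solving multiple linear regression
  • Mean Square Error (MSE)
    • Gradient Descent
    • Normal Equation (ordinary least squares)
    • Locally Weighted Linear Regression (LWLR): addresses underfitting in linear regression by introducing some bias into the estimate in order to lower the mean squared error of the predictions.
    • Ridge regression and shrinkage methods
  • Choosing a method: compared with Gradient Descent, the Normal Equation is computationally expensive (it needs the transpose and the inverse of X), so as a rule of thumb it is only suitable when there are fewer than 100,000 features; beyond that, use gradient descent. When X is not invertible, ridge regression is an alternative. LWLR adds computation, because every prediction uses the whole dataset instead of plugging the input into a regression equation whose coefficients were computed once, so it is generally not chosen.
  • Tuning: balance prediction bias against model variance (high bias means underfitting, high variance means overfitting)
    • Get more training samples: fixes high variance
    • Try a smaller set of features: fixes high variance
    • Try to obtain additional features: fixes high bias
    • Try adding polynomial feature combinations: fixes high bias
    • Try decreasing λ: fixes high bias
    • Try increasing λ: fixes high variance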
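
To compare the two solvers mentioned above, the following sketch fits y = w·x + b on a handful of made-up points, once with the closed-form least-squares solution (the single-feature special case of the Normal Equation) and once with gradient descent on the MSE. The data, learning rate, and step count are illustrative assumptions.

```python
def fit_normal_equation(xs, ys):
    """Closed-form least squares for a single feature: y = w * x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b


def fit_gradient_descent(xs, ys, lr=0.05, steps=5000):
    """Minimise the mean squared error by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Made-up data, roughly following y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

if __name__ == "__main__":
    print(fit_normal_equation(xs, ys))
    print(fit_gradient_descent(xs, ys))
```

Both approaches arrive at the same line here; the trade-off described above only shows up as the number of features grows, when inverting the matrix becomes the bottleneck.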

Softmax Regression

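  • Softmax Regression hypothesis function
  • Softmax Regression cost function
  • Interpretation:
  • Softmax Regression & Logistic Regression:
    • Multi-class vs. binary classification. Logistic Regression is Softmax Regression with K=2
    • For a K-class problem, use Softmax Regression when the classes are mutually exclusive, and K independent Logistic Regressions when they are not
  • Summary: Softmax Regression suits classification with more than two classes. In this example it gives the probability that each image belongs to each digit.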
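
As a minimal sketch of the ideas above, the code below implements the standard softmax function and the cross-entropy loss for a single example, and checks numerically that the K=2 case reduces to the logistic (sigmoid) function. The function names and the sample logits are assumptions made for this illustration.

```python
import math


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-probability assigned to the correct class."""
    return -math.log(probs[label])


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Ten made-up scores, one per digit class 0-9.
    logits = [0.1, 0.2, 3.0, 0.0, -1.0, 0.5, 0.3, 0.0, 0.2, 0.1]
    probs = softmax(logits)
    print("predicted digit:", probs.index(max(probs)))
    print("loss if the true digit is 2:", cross_entropy(probs, 2))
    # With two classes and the second score fixed at 0,
    # softmax gives the same probability as the sigmoid.
    print(softmax([1.5, 0.0])[0], sigmoid(1.5))
```

The predicted class is simply the index of the largest probability, which is how the ten logits of the final dense layer are turned into a digit.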

References & Recommends

MNIST

Softmax

CNN

TensorFlow+CNN / TensorFlow+Android



By SkySeraph-2018

SkySeraph cnBlogs
SkySeraph CSDN

This article was first published on skyseraph.com as "Android+TensorFlow+CNN+MNIST: Handwritten Digit Recognition".

A Roundup of Open-Source WPF Projects

FrameWork:

Prism – Application framework which provides an implementation of a collection of design patterns (MVVM, EventAggregator, …) that are helpful in writing well structured and maintainable applications

UI:

Dragablz – Dragable and tearable tab control for WPF

MahApps.Metro – “Metro” or “Modern UI” for WPF applications

MaterialDesignInXamlToolkit – Material Design templates and styles for WPF

MaterialSkin – Theming .NET WinForms, C#, to Google’s Material Design Principles

Plot:

OxyPlot – Plotting library for .NET

Live-Charts – Simple, flexible, interactive & powerful charts, maps and gauges for .NET

Common:

Newtonsoft.Json – JSON framework for .NET

WpfLocalizeExtension – Library for the localization

Sample:

WPF-Samples – Repository for WPF related samples

PrismMahAppsSample – Modular application sample based on the PRISM-Library and MahApps.Metro as UI

Prism-Samples-Windows – Samples that demonstrate how to use various Prism features

GearedExamples – A set of examples for the LiveCharts.Geared package

Projects:

MetroFtpClient – FTP client (MahApps.Metro, OxyPlot, Prism)

BaiduPanDownloadWinform – Baidu Netdisk download tool without speed limits (MahApps.Metro, Prism)

Others:

WPF Miscellany: Opening Remarks

How to install redis server on CentOS

In this tutorial we will learn how to install Redis server on CentOS 7 / RHEL 7. The name Redis stands for REmote DIctionary Server. It is one of the most popular open-source, advanced key-value caches and stores.

Project URL : http://redis.io/

Follow the steps below to install Redis server on CentOS 7 and Red Hat Enterprise Linux 7.

Install wget utility

Install wget command

yum install wget

Install EPEL repo

First we will install the EPEL repo.

Because our system runs on the x86_64 architecture, we use the EPEL repo package for x86_64 only. Look for the EPEL repo package that matches your operating system architecture.

wget -r --no-parent -A 'epel-release-*.rpm' http://dl.fedoraproject.org/pub/epel/7/x86_64/e/

rpm -Uvh dl.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-*.rpm

This creates two EPEL repo files inside /etc/yum.repos.d:
1. epel.repo
2. epel-testing.repo

[root@localhost ~]# ls -l /etc/yum.repos.d/
total 28
-rw-r--r--. 1 root root 1612 Jul  4 07:00 CentOS-Base.repo
-rw-r--r--. 1 root root  640 Jul  4 07:00 CentOS-Debuginfo.repo
-rw-r--r--. 1 root root 1331 Jul  4 07:00 CentOS-Sources.repo
-rw-r--r--. 1 root root  156 Jul  4 07:00 CentOS-Vault.repo
-rw-r--r--. 1 root root  957 Sep  2 12:14 epel.repo
-rw-r--r--. 1 root root 1056 Sep  2 12:14 epel-testing.repo
[root@localhost ~]#

Install redis server

Now use yum command to install redis server

yum install redis

The two important Redis server configuration files are:
1. /etc/redis.conf
2. /etc/redis-sentinel.conf

Now start the Redis server.

systemctl start redis.service

Check the running status of redis server

systemctl status redis.service

To test the Redis installation, use the command below:

redis-cli ping

If the response is PONG, the installation completed successfully.

[root@localhost ~]# redis-cli ping
PONG
[root@localhost ~]#
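
For readers who want to run the same check from a script, here is a small sketch of what redis-cli ping does on the wire: it sends a PING command encoded in the Redis serialization protocol (RESP) and expects the simple string +PONG back. The function names are made up for this example, and it assumes a server on the default port without password authentication.

```python
import socket


def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def ping(host="127.0.0.1", port=6379, timeout=2.0):
    """Return True if the server answers PING with +PONG."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(encode_command("PING"))
        reply = conn.recv(64)
    return reply == b"+PONG\r\n"
```

Calling ping() on the machine where Redis was just installed should return True; a refused connection raises an OSError instead, which usually means the service is not running.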

Start/Stop/Restart/Status and Enable redis server

To start redis server

systemctl start redis.service

To stop redis server

systemctl stop redis.service

To restart redis server

systemctl restart redis.service

To get running status of redis server

systemctl status redis.service

To enable Redis server at boot time:

systemctl enable redis.service

To disable Redis server at boot time:

systemctl disable redis.service

Listening Port Of Redis Server

By default, Redis server listens on port 6379. Use the ss command below to confirm.

[root@localhost ~]# ss -nlp|grep redis
tcp    LISTEN     0      128            127.0.0.1:6379                  *:*      users:(("redis-server",19706,4))
[root@localhost ~]#

Note: On a minimal CentOS 7 / RHEL 7 installation, the netstat command is not available. Use the ss command instead, which is available on the system by default.

Learn Redis : http://redis.io/documentation

Who is using Redis

Redis configuration file parameter reference

from:http://sharadchhetri.com/2014/10/04/install-redis-server-centos-7-rhel-7/

How to Install Apache Tomcat on CentOS

Apache Tomcat is an open source Java Servlet implementation developed by the Apache Software Foundation. Besides Java Servlets, Tomcat implements several other Java server technologies, including JavaServer Pages (JSP), Java Expression Language, and Java WebSocket. Tomcat provides an HTTP web server for Java applications with support for HTTP/2, OpenSSL for JSSE, and TLS virtual hosting.

In this tutorial, I will show you how to install and configure Apache Tomcat 8.5 on a CentOS 7 server and how to install and configure Java on a CentOS server which is one of the prerequisites for Tomcat.

Prerequisites

  • Server with CentOS 7 – 64bit
  • 2 GB or more RAM (Recommended)
  • Root Privileges on the server

Step 1 – Install Java (JRE and JDK)

In this step, we will install the Java JRE and JDK from the CentOS repository. We will install Java 1.8.0 (OpenJDK) on the server with the yum command.

Run this command to install the Java JRE and JDK from the CentOS repository with yum:

yum -y install java-1.8.0-openjdk.x86_64 java-1.8.0-openjdk-devel.x86_64

This takes some time; wait until the installation has finished.

Then you should check the Java version with the command below:

java -version

You should see results similar to the ones below:

openjdk version "1.8.0_111"
OpenJDK Runtime Environment (build 1.8.0_111-b15)
OpenJDK 64-Bit Server VM (build 25.111-b15, mixed mode)

Step 2 – Configure the Java Home Environment

In the first step, we installed Java. Now we need to configure the JAVA_HOME environment variable on the CentOS server so that Java applications can find the right Java version. Tomcat also requires JAVA_HOME to be set up properly.

Before we configure JAVA_HOME, we need to know where the Java directory is. Check the Java directory with the command below:

sudo update-alternatives --config java

Java directory = "/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-1.b15.el7_2.x86_64/jre"

Then edit the environment file with vim:

vim /etc/environment

Add the JAVA_HOME environment variable by adding the configuration below:

JAVA_HOME="/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-1.b15.el7_2.x86_64/jre"

Save the /etc/environment file and exit vim.

Next, edit the .bash_profile file and add the JAVA_HOME variable as well:

vim ~/.bash_profile

At the end of the file, paste the configuration below:

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.111-1.b15.el7_2.x86_64/jre
export PATH=$JAVA_HOME/bin:$PATH

Save the file, then reload the bash_profile file.

source ~/.bash_profile

Make sure there are no errors. Finally, check the JAVA_HOME environment variable:

echo $JAVA_HOME

You will see the path of the Java directory.

Step 3 – Install Apache Tomcat 8.5

In this step, we will install Apache Tomcat under the user tomcat (which we have to create first).

Create a user and group named tomcat:

groupadd tomcat
useradd -s /bin/false -g tomcat -d /opt/tomcat tomcat

Note:
-s /bin/false = disable shell access
-g tomcat = assign new user to the group tomcat
-d /opt/tomcat = define the home directory for the user

Next, go to the /opt directory and download tomcat with the wget command:

cd /opt/
wget http://mirror.wanxp.id/apache/tomcat/tomcat-8/v8.5.6/bin/apache-tomcat-8.5.6.tar.gz

Extract Tomcat and move all the files and directories in the 'apache-tomcat-8.5.6' directory to the 'tomcat' directory.

tar -xzvf apache-tomcat-8.5.6.tar.gz
mv apache-tomcat-8.5.6/* tomcat/

Now change the owner of the tomcat directory to the tomcat user and group.

chown -hR tomcat:tomcat tomcat

Step 4 – Test Apache Tomcat

In step 3, we installed and configured Tomcat. In this step, we just want to run a short test to make sure there are no errors.

Go to the tomcat/bin directory and run the command ‘startup.sh’ to test Apache Tomcat:

cd /opt/tomcat/bin/
./startup.sh

Make sure the result is ‘Tomcat started’.

Tomcat is using port 8080 now, check the open port on the server with the netstat command.

netstat -plntu

Or visit the server IP address on port 8080 (in my case 192.168.1.120:8080) with a web browser. You will see the Apache Tomcat default page.

Next, stop Apache Tomcat, because in the final configuration we will run it with a systemd service file. Make sure the tomcat directory is owned by the tomcat user and group.

cd /opt/tomcat/bin/
./shutdown.sh
chown -hR tomcat:tomcat /opt/tomcat/

Step 5 – Setup Apache Tomcat Service

In this tutorial, we will run Apache Tomcat as tomcat user with a systemd service file for easy starting and stopping of the service. So the next step is to create a ‘tomcat.service’ file.

Go to the systemd system directory and create a new file ‘tomcat.service’.

cd /etc/systemd/system/
vim tomcat.service

Paste the configuration below:

[Unit]
Description=Apache Tomcat 8 Servlet Container
After=syslog.target network.target
 
[Service]
User=tomcat
Group=tomcat
Type=forking
Environment=CATALINA_PID=/opt/tomcat/tomcat.pid
Environment=CATALINA_HOME=/opt/tomcat
Environment=CATALINA_BASE=/opt/tomcat
ExecStart=/opt/tomcat/bin/startup.sh
ExecStop=/opt/tomcat/bin/shutdown.sh
Restart=on-failure
 
[Install]
WantedBy=multi-user.target

Save the file and exit vim.

Reload the systemd daemon, then start and add the Apache Tomcat service at boot time.

systemctl daemon-reload
systemctl start tomcat
systemctl enable tomcat

Now check that tomcat is running by checking the open port 8080.

netstat -plntu
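
If netstat is not installed, the same port check can be done from a short script. The sketch below simply tries to open a TCP connection; the function name is made up for this example, and a successful connection only shows that something is listening on the port, not that it is Tomcat.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    print("port 8080 open:", is_port_open("127.0.0.1", 8080))
```

Run it on the server itself with 127.0.0.1, or from another machine with the server's IP address once the firewall allows port 8080.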

And check the tomcat status, make sure the service is active.

systemctl status tomcat

Step 6 – Configure Apache Tomcat Users

In this step, we will configure the users for Apache Tomcat. Tomcat is installed and running on port 8080 by default, and we can reach it with a web browser, but we cannot access the manager dashboard yet. To enable and configure Tomcat users, edit the file 'tomcat-users.xml'.

Go to the tomcat configuration directory and edit the tomcat-users.xml file with vim.

cd /opt/tomcat/conf/
vim tomcat-users.xml

Create a new line under line 43 and paste configuration below:

<role rolename="manager-gui"/>
<user username="admin" password="password" roles="manager-gui,admin-gui"/>

Save the file and exit vim.

Next, go to the manager directory and edit the context.xml file.

cd /opt/tomcat/webapps/manager/META-INF/
vim context.xml

Comment out lines 19 and 20.

<Context antiResourceLocking="false" privileged="true" >
<!--  <Valve className="org.apache.catalina.valves.RemoteAddrValve"
allow="127\.\d+\.\d+\.\d+|::1|0:0:0:0:0:0:0:1" /> -->
</Context>

Save the file and exit vim.

Go to the host-manager directory and edit the context.xml file again.

cd /opt/tomcat/webapps/host-manager/META-INF/
vim context.xml

Again, comment out lines 19 and 20.

<Context antiResourceLocking="false" privileged="true" >
<!--  <Valve className="org.apache.catalina.valves.RemoteAddrValve"
allow="127\.\d+\.\d+\.\d+|::1|0:0:0:0:0:0:0:1" /> -->
</Context>

Save the file and exit, then restart tomcat.

systemctl restart tomcat

Step 7 – Configure Firewalld

In CentOS 7, we have a default firewall tool named firewalld. It replaces the iptables interface and connects to the Netfilter kernel code.

In this step, we will start the firewalld service and open port 8080 so we can access the Apache Tomcat server from the outside of the network.

Start the firewalld service and add it to start at boot time with the systemctl command.

systemctl start firewalld
systemctl enable firewalld

Next, add the apache tomcat port 8080 to the firewall with the firewall-cmd command, and reload the firewalld service.

firewall-cmd --zone=public --permanent --add-port=8080/tcp
firewall-cmd --reload

Check that all the services are available in the firewall and check that the Apache Tomcat port 8080 is open.

firewall-cmd --list-ports
firewall-cmd --list-services

Apache Tomcat port 8080 is accessible from outside of the network, and the ssh port is open by default as well.

Step 8 – Testing

Open your web browser and type in your server IP with port 8080. You will see the Apache Tomcat default page.

http://192.168.1.120:8080

Go to the manager dashboard with the URL below:

http://192.168.1.120:8080/manager/html

Log in with the username 'admin' and the password 'password', as configured in step 6.

Now go to the host-manager dashboard with the URL below:

http://192.168.1.120:8080/host-manager/html

Enter the admin user and password that you set in step 6; you will see the Tomcat Virtual Host Manager.

Apache Tomcat 8.5 has been installed on a CentOS 7 Server.

from:https://www.howtoforge.com/tutorial/how-to-install-tomcat-on-centos/

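JVM Heap and Non-Heap Memory

Heap and Non-Heap Memory

According to the official documentation: "The Java virtual machine has a heap, which is the runtime data area from which memory for all class instances and arrays is allocated. The heap is created when the Java virtual machine starts up." "Memory outside the heap in the JVM is called non-heap memory."

The JVM manages two types of memory: heap and non-heap.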

Heap memory: Eden Space, Survivor Space, Tenured Gen
Non-heap memory: Code Cache, Perm Gen, native heap? (the author's guess)

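Heap Memory

The Java virtual machine has a heap, the runtime data area from which memory for all class instances and arrays is allocated. The heap is created when the Java virtual machine starts up. Heap memory for objects is reclaimed by an automatic memory management system known as the garbage collector.

The heap may be of fixed size or may expand and shrink. The memory for the heap does not need to be contiguous.

Non-Heap Memory

The Java virtual machine also manages memory outside the heap (called non-heap memory).

The Java virtual machine has a method area that is shared among all threads. The method area belongs to non-heap memory. It stores per-class structures such as the runtime constant pool, field and method data, and the code for methods and constructors. It is created when the Java virtual machine starts up.

The method area is logically part of the heap, but a Java virtual machine implementation may choose not to garbage collect or compact it. Like the heap, the method area may be of fixed size or may expand and shrink. The memory for the method area does not need to be contiguous.

In addition to the method area, a Java virtual machine implementation may need memory for internal processing or optimization, and this also counts as non-heap memory. For example, the JIT compiler needs memory to store the native code translated from Java virtual machine code for high performance.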

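A Few Basic Concepts

PermGen space: short for Permanent Generation space. It is a permanently retained area that stores Class and Meta information. A class is placed in this area when it is loaded. GC (Garbage Collection) generally does not clean up PermGen space, so an application that loads many classes is likely to hit a PermGen space error.

Heap space: stores instances.

The Java heap is divided into three areas: Young (the young generation), Old (the old generation), and Permanent.

Young holds newly instantiated objects. When this area fills up, GC moves objects to the Old area. The Permanent area holds reflective objects.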

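Heap Memory Allocation

  • The initial heap size is set with -Xms; the default is 1/64 of physical memory.
  • The maximum heap size is set with -Xmx; the default is 1/4 of physical memory.
  • By default, when free heap memory falls below 40%, the JVM grows the heap, up to the -Xmx limit.
  • When free heap memory exceeds 70%, the JVM shrinks the heap, down to the -Xms limit.
  • For this reason, servers usually set -Xms and -Xmx to the same value so that the heap is not resized after every GC.
  • Note: if -Xmx is unset or too small, the application may fail with java.lang.OutOfMemoryError. This is an Error thrown by the JVM rather than an Exception, so an ordinary try…catch on Exception will not catch it.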
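
The defaults and thresholds listed above amount to simple arithmetic, sketched below. The function names are made up for this illustration, and the fractions and percentages are the ones quoted in this article; actual JVM defaults vary between versions and platforms.

```python
def default_heap_bounds_mb(physical_mb):
    """Initial and maximum heap size under the 1/64 and 1/4 rules above."""
    return physical_mb / 64, physical_mb / 4


def resize_decision(free_ratio, min_free=0.40, max_free=0.70):
    """Say what the JVM would do for a given share of free heap."""
    if free_ratio < min_free:
        return "grow"    # less than 40% free: expand toward -Xmx
    if free_ratio > max_free:
        return "shrink"  # more than 70% free: contract toward -Xms
    return "keep"


if __name__ == "__main__":
    initial, maximum = default_heap_bounds_mb(8192)
    print("8 GB machine: initial %.0f MB, maximum %.0f MB" % (initial, maximum))
    for ratio in (0.25, 0.55, 0.80):
        print(ratio, resize_decision(ratio))
```

Setting -Xms equal to -Xmx removes both resize branches, which is why the article recommends it for servers.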

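Non-Heap Memory Allocation

  • The JVM uses -XX:PermSize to set the initial non-heap size; the default is 1/64 of physical memory.
  • -XX:MaxPermSize sets the maximum non-heap size; the default is 1/4 of physical memory.
  1. According to another account, the default MaxPermSize depends on the -server and -client options: 64m under -server and 32m under -client. I have not verified this.
  • Setting -XX:MaxPermSize too small leads to java.lang.OutOfMemoryError: PermGen space, that is, the permanent generation runs out of memory.
  • Why the overflow happens:
  1. This memory stores Class and Meta information. A class is placed in PermGen space when it is loaded, which is a different area from the heap where instances live.
  2. GC (Garbage Collection) does not clean up PermGen space while the main program is running, so an application that loads many classes is likely to hit a PermGen space error.
  • This error commonly appears when a web server precompiles JSPs.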

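JVM Memory Limits (Maximum)

  • First, JVM memory is limited by the physical memory actually installed. Assuming unlimited physical memory, the maximum depends heavily on the operating system. Put simply, although a 32-bit processor can address 4 GB, the operating system imposes its own limit, usually 2 GB-3 GB (typically 1.5 GB-2 GB on Windows and 2 GB-3 GB on Linux). Processors with 64 bits or more do not have this limit.
  • Why does Eclipse start on some machines but not on others after I set both -Xmx and -XX:MaxPermSize to 512M?

From the overview of JVM memory management above, we know that JVM memory comes in two kinds, heap and non-heap, and that the maximum JVM memory depends first on the physical memory and the operating system. VM parameters therefore usually prevent a program from starting for one of these reasons:

  1. -Xms is larger than -Xmx, or -XX:PermSize is larger than -XX:MaxPermSize.
  2. The sum of -Xmx and -XX:MaxPermSize exceeds the maximum JVM memory, for example the operating system limit or the physical memory actually available. On the subject of physical memory: if you have 1024 MB installed, the system cannot actually use all 1024 MB, because part of it is taken by hardware.
  • If you have a dual-core CPU, you can try -XX:+UseParallelGC to make GC run faster. (This GC option was added in JDK 5.)
  • If your web applications use a large number of third-party jars whose size exceeds the server JVM's default, memory will overflow. The fix is to set MaxPermSize:
  1. Add these JVM options to the server startup: -Xms128m -Xmx256m -XX:PermSize=128M -XX:MaxNewSize=256m -XX:MaxPermSize=256m
  2. For Tomcat, edit TOMCAT_HOME/bin/catalina.sh and add the following line above echo "Using CATALINA_BASE: $CATALINA_BASE": JAVA_OPTS="-server -XX:PermSize=64M -XX:MaxPermSize=128m"
  • Recommendation: move third-party jar files that several applications share to the tomcat/shared/lib directory, which reduces the memory taken up by duplicate jars.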

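JVM Memory Options

  • Memory options
Option: Description
-Xms512m: initial heap size of 512m (the minimum JVM heap size, allocated at startup)
-Xmx1024m: maximum heap size of 1024m (the largest the JVM heap may grow, allocated on demand)
-XX:PermSize=512M: initial non-heap memory
-XX:MaxPermSize=1024M: maximum non-heap memory, allocated on demand
-XX:NewSize/-XX:MaxNewSize: size of the young generation; NewSize is its size at JVM startup and MaxNewSize is the most it may occupy
-XX:SurvivorRatio: ratio between the Eden space and a Survivor space in the young generation
  • Notes:
  1. If -Xmx is unset or too small, the application may fail with java.lang.OutOfMemoryError. This is an Error thrown by the JVM rather than an Exception, so an ordinary try…catch on Exception will not catch it.
  2. PermSize and MaxPermSize limit the memory the virtual machine allocates for the permanent generation, that is, for reflective objects such as class objects and method objects. This memory is not part of the heap.
  3. Setting -XX:MaxPermSize too small leads to java.lang.OutOfMemoryError: PermGen space.
  4. The default MaxPermSize depends on the -server and -client options: 64m under -server and 32m under -client.
  • How a block of memory is allocated
  1. The JVM tries to initialize a memory region in Eden for the new Java object.
  2. If Eden has enough space, the allocation is done. Otherwise, continue with the next step.
  3. The JVM tries to free all inactive objects in Eden (a level-1 or higher garbage collection). If Eden still cannot hold the new object afterwards, the JVM tries to move some live objects from Eden to the Survivor area or the Old area.
  4. The Survivor area serves as the intermediate exchange area between Eden and Old. When the Old area has enough space, objects in the Survivor area are moved to Old; otherwise they stay in the Survivor area.
  5. When the Old area runs short of space, the JVM performs a full garbage collection (level 0) in the Old area.
  6. If, after the full collection, the Survivor and Old areas still cannot hold the objects copied from Eden, so that the JVM cannot create a memory region in Eden for the new object, an "out of memory" error occurs.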

A typical response-time-first JVM configuration for a Resin server:

-Xmx2000M -Xms2000M -Xmn500M
-XX:PermSize=250M -XX:MaxPermSize=250M
-Xss256K
-XX:+DisableExplicitGC
-XX:SurvivorRatio=1
-XX:+UseConcMarkSweepGC
-XX:+UseParNewGC
-XX:+CMSParallelRemarkEnabled
-XX:+UseCMSCompactAtFullCollection
-XX:CMSFullGCsBeforeCompaction=0
-XX:+CMSClassUnloadingEnabled
-XX:LargePageSizeInBytes=128M
-XX:+UseFastAccessorMethods
-XX:+UseCMSInitiatingOccupancyOnly
-XX:CMSInitiatingOccupancyFraction=60
-XX:SoftRefLRUPolicyMSPerMB=0
-XX:+PrintClassHistogram
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC
-Xloggc:log/gc.log

Garbage Collection Algorithms

Java has four different collectors, selected with the following startup options:

-XX:+UseSerialGC
-XX:+UseParallelGC
-XX:+UseParallelOldGC
-XX:+UseConcMarkSweepGC

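Serial Collector

This is the default on most platforms, or when java -client is forced.

young generation algorithm = serial

old generation algorithm = serial (mark-sweep-compact)

The drawback is obvious: it is stop-the-world and slow. It is not recommended for server applications.

Parallel Collector

This is the default on Linux x64; other platforms select it by default only with the java -server option.

young = parallel: multiple threads copy at the same time

old = mark-sweep-compact (same as collector 1)

Pro: young generation collection is faster. Since most GCs are young generation collections, this improves throughput (the share of CPU time spent outside GC).

Con: on 8 GB/16 GB servers, pause times become too long when the old generation holds many live objects.

Parallel Compact Collector (ParallelOld)

young = parallel (same as collector 2)

old = parallel: the space is split into independent regions; a region with few live objects is collected, one with many is skipped

Pro: old generation performance improves over the Parallel collector.

Con: on most server systems old generation occupancy reaches 60%-80%, so there are few ideal regions with so few live objects that they can be reclaimed quickly, and the compaction overhead is not noticeably lower than with the Parallel collector.

Concurrent Mark-Sweep (CMS) Collector

young generation = parallel collector (same as collector 2)

old = cms

No compaction is performed.

Pro: pause times drop. Collector 4 is recommended for pause-sensitive workloads that have spare CPU.

Con: it uses a lot of CPU, so it does not suit CPU-bound servers. It also causes heavy fragmentation: storing an object means following the free list through several hops, and more space is wasted.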

Memory Monitoring

  • jmap -heap shows Java heap usage
jmap -heap pid
 
using thread-local object allocation.
 
Parallel GC with 4 thread(s)   # GC mode
 
Heap Configuration:  # initial heap configuration
 
MinHeapFreeRatio=40  # JVM option -XX:MinHeapFreeRatio, minimum free ratio of the heap (default 40)
MaxHeapFreeRatio=70  # JVM option -XX:MaxHeapFreeRatio, maximum free ratio of the heap (default 70)
MaxHeapSize=512.0MB  # JVM option -XX:MaxHeapSize=, maximum size of the heap
NewSize  = 1.0MB     # JVM option -XX:NewSize=, default size of the young generation
MaxNewSize =4095MB   # JVM option -XX:MaxNewSize=, maximum size of the young generation
OldSize  = 4.0MB     # JVM option -XX:OldSize=<value>, size of the old generation
NewRatio  = 8        # JVM option -XX:NewRatio=, size ratio between the young and old generations
SurvivorRatio = 8    # JVM option -XX:SurvivorRatio=, size ratio between Eden and a Survivor space in the young generation
PermSize= 16.0MB     # JVM option -XX:PermSize=<value>, initial size of the permanent generation
MaxPermSize=64.0MB   # JVM option -XX:MaxPermSize=<value>, maximum size of the permanent generation
 
Heap Usage:          # heap memory distribution
 
PS Young Generation
 
Eden Space:         # memory distribution of the Eden space
 
capacity = 20381696 (19.4375MB)             # total Eden capacity
used     = 20370032 (19.426376342773438MB)  # Eden space in use
free     = 11664 (0.0111236572265625MB)     # Eden space remaining
99.94277218147106% used                     # Eden usage ratio
 
From Space:        # memory distribution of one Survivor space
 
capacity = 8519680 (8.125MB)
used     = 32768 (0.03125MB)
free     = 8486912 (8.09375MB)
0.38461538461538464% used
 
To Space:          # memory distribution of the other Survivor space
 
capacity = 9306112 (8.875MB)
used     = 0 (0.0MB)
free     = 9306112 (8.875MB)
0.0% used
 
PS Old Generation  # current memory distribution of the Old generation
 
capacity = 366280704 (349.3125MB)
used     = 322179848 (307.25464630126953MB)
free     = 44100856 (42.05785369873047MB)
87.95982001825573% used
 
PS Perm Generation # current memory distribution of the permanent generation
 
capacity = 32243712 (30.75MB)
used     = 28918584 (27.57891082763672MB)
free     = 3325128 (3.1710891723632812MB)
89.68751488662348% used
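
The megabyte values and usage percentages in the jmap output above follow directly from the byte counts. The short sketch below recomputes two of them; the helper names are made up for this illustration.

```python
def to_mb(num_bytes):
    """Convert bytes to megabytes (1 MB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


def percent_used(used_bytes, capacity_bytes):
    """Share of a memory pool that is in use, as a percentage."""
    return 100.0 * used_bytes / capacity_bytes


if __name__ == "__main__":
    # Eden space figures taken from the output above.
    print(to_mb(20381696), percent_used(20370032, 20381696))
    # From space figures taken from the output above.
    print(to_mb(8519680), percent_used(32768, 8519680))
```

An Eden space that is almost full, as in this sample, just means a young generation collection is about to run; the Old and Perm usage figures are the ones to watch over time.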
  • JVM memory monitoring tool
<%@ page import="java.lang.management.*" %>
<%@ page import="java.util.*" %>
<html>
<head>
  <title>JVM Memory Monitor</title>
</head>
<body>
<table border="0" width="100%">
    <tr><td colspan="2" align="center"><h3>Memory MXBean</h3></td></tr>
    <tr><td width="200">Heap Memory Usage</td><td><%=ManagementFactory.getMemoryMXBean().getHeapMemoryUsage()%></td></tr>
    <tr><td>Non-Heap Memory Usage</td><td><%=ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage()%></td></tr>
    <tr><td colspan="2"> </td></tr>
    <tr><td colspan="2" align="center"><h3>Memory Pool MXBeans</h3></td></tr>
<%
        Iterator iter = ManagementFactory.getMemoryPoolMXBeans().iterator();
        while (iter.hasNext()) {
            MemoryPoolMXBean item = (MemoryPoolMXBean) iter.next();
%>
<tr><td colspan="2">
    <table border="0" width="100%" style="border: 1px #98AAB1 solid;">
        <tr><td colspan="2" align="center"><b><%= item.getName() %></b></td></tr>
        <tr><td width="200">Type</td><td><%= item.getType() %></td></tr>
        <tr><td>Usage</td><td><%= item.getUsage() %></td></tr>
        <tr><td>Peak Usage</td><td><%= item.getPeakUsage() %></td></tr>
        <tr><td>Collection Usage</td><td><%= item.getCollectionUsage() %></td></tr>
    </table>
</td></tr>
<tr><td colspan="2"> </td></tr>
<%} %>
</table>
</body>
</html>

from:http://www.importnew.com/27645.html