Artificial Neural Network-Adaline & Madaline
朝陽科技大學 (Chaoyang University of Technology)
資訊管理系 (Department of Information Management)
Prof. 李麗華
Outline
• ADALINE
• MADALINE
2
朝陽科技大學 李麗華 教授
Introduction to ADALINE (1/2)
• ADALINE (Adaptive Linear Neuron or
Adaptive Linear Element) is a single
layer neural network.
• It was developed by Professor Bernard
Widrow and his graduate student Ted
Hoff at Stanford University in 1960.
• It is based on the McCulloch–Pitts
neuron. It consists of a weight, a bias
and a summation function.
Reference: http://en.wikipedia.org/wiki/ADALINE
Introduction to ADALINE (2/2)
• The difference between Adaline and the standard
(McCulloch-Pitts) perceptron is that in the learning
phase the weights are adjusted according to the
weighted sum of the inputs (the net).
• In the standard perceptron, the net is passed to
the activation (transfer) function and the function's
output is used for adjusting the weights.
• There also exists an extension known as
Madaline.
Reference: http://en.wikipedia.org/wiki/ADALINE
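The difference between the two learning phases can be sketched in code. This is a minimal illustration, not from the slides: the function names and the learning rate eta are assumptions. The Adaline (delta/LMS) rule adjusts weights with the raw net value, while the standard perceptron rule uses the thresholded output.

```python
def adaline_update(w, x, t, eta=0.1):
    """Adaline (delta/LMS) rule: adjust weights using the raw net value."""
    net = sum(wi * xi for wi, xi in zip(w, x))
    return [wi + eta * (t - net) * xi for wi, xi in zip(w, x)]

def perceptron_update(w, x, t, eta=0.1):
    """Standard perceptron rule: adjust weights using the thresholded output."""
    net = sum(wi * xi for wi, xi in zip(w, x))
    y = 1 if net >= 0 else -1
    return [wi + eta * (t - y) * xi for wi, xi in zip(w, x)]
```

Note that for an input whose net is 0 but whose thresholded output already matches the target, the perceptron rule leaves the weights unchanged while the Adaline rule still moves them toward the target.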
ADALINE (1/3)
• ADALINE (Adaptive Linear Neuron), 1959, by Bernard Widrow
[Figure: inputs X1, X2, …, Xn with weights w1, w2, …, wn feed a single processing element (PE).]
ADALINE (2/3)
• Method: the value of each unit must be +1 or −1 (the perceptron outputs 0 and 1).
• $net = \sum_i X_i W_i$; with the bias input $X_0 = 1$:
  $net = W_0 X_0 + W_1 X_1 + W_2 X_2 + \cdots + W_n X_n$
• $Y = \begin{cases} +1 & \text{if } net \ge 0 \\ -1 & \text{if } net < 0 \end{cases}$
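The threshold unit above can be sketched as follows (a minimal illustration; the function name is an assumption):

```python
def adaline_output(w, x):
    """ADALINE forward pass: weighted sum over the inputs (x[0] = 1 acts
    as the bias input), then the bipolar threshold from the slide:
    +1 if net >= 0, otherwise -1."""
    net = sum(wi * xi for wi, xi in zip(w, x))
    return 1 if net >= 0 else -1
```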
ADALINE (3/3)
Introduction to MADALINE
• Madaline (Multiple Adaline) uses a set of ADALINEs in parallel as its input layer and a single PE (processing element) in its output layer. The network of ADALINEs can span many layers.
• For problems with multiple input variables and one output, each input is applied to one Adaline.
• For similar problems with multiple outputs, Madalines can be used in parallel.
• The Madaline network is useful for problems which involve
prediction based on multiple inputs, such as weather
forecasting (Input variables: barometric pressure, difference in
pressure. Output variables: rain, cloudy, sunny).
Reference: http://en.wikipedia.org/wiki/ADALINE
MADALINE
• MADALINE: it is composed of many ADALINEs (Multilayer Adaline).
[Figure: inputs x1, …, xn feed a layer of ADALINEs through weights Wij; each ADALINE produces a net value netj, and a single second-layer PE combines them into the output Y. There are no weights Wij in the second layer.]
• After the second layer, a majority vote is applied: if more than half of the netj ≥ 0, the output is +1; otherwise, the output is −1.
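The majority-vote scheme above can be sketched as follows (a minimal illustration; the function name and the weight layout, one ADALINE per row of W, are assumptions):

```python
def madaline_output(W, x):
    """Minimal MADALINE sketch: each row of W is one ADALINE's weight
    vector; the fixed second layer (no trainable weights) takes a
    majority vote: +1 if more than half of the nets are >= 0, else -1."""
    nets = [sum(wi * xi for wi, xi in zip(w, x)) for w in W]
    votes = sum(1 for net in nets if net >= 0)
    return 1 if votes > len(nets) / 2 else -1
```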
Least-Square Learning Rule (1/6)
$X = [x_0, x_1, \ldots, x_n]^t$ (i.e., $X_k$ is the $k$-th input pattern, $1 \le k \le L$)
  – $k$: index of the $k$-th input pattern
  – $t$: denotes the vector transpose
  – $L$: number of input patterns
$W = [w_0, w_1, \ldots, w_n]^t$
$net_j = W^t X_j = \sum_{i=0}^{n} w_i x_i = w_0 x_0 + w_1 x_1 + \cdots + w_n x_n$
Least-Square Learning Rule (2/6)
• By applying the least-square learning rule, the weights can be obtained from the formula
$R W^* = P$, where $R$ is the Correlation Matrix:
$R = \frac{R'}{L} = \frac{1}{L}\sum_{j=1}^{L} X_j X_j^t$, with $R' = R_1' + R_2' + \cdots + R_L'$ and $R_j' = X_j X_j^t$
$P^t = \frac{1}{L}\sum_{j=1}^{L} T_j X_j^t$
$W^* = R^{-1} P$
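The formula W* = R⁻¹P above translates directly into code. A minimal sketch, assuming NumPy is available; the function name is illustrative:

```python
import numpy as np

def least_square_weights(X, T):
    """Solve R W* = P for the Adaline weights, following the slide:
    R = (1/L) * sum_j Xj Xj^t  and  P = (1/L) * sum_j Tj Xj."""
    X = np.asarray(X, dtype=float)   # shape (L, n): one pattern per row
    T = np.asarray(T, dtype=float)   # shape (L,): one target per pattern
    L = len(X)
    R = X.T @ X / L                  # correlation matrix
    P = X.T @ T / L
    return np.linalg.solve(R, P)     # W* = R^-1 P
```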
Least-Square Learning Rule (3/6)
Example: $X_1 = [1, 1, 0]^t$, $X_2 = [1, 0, 1]^t$, $X_3 = [1, 1, 1]^t$; $T_1 = 1$, $T_2 = 1$, $T_3 = -1$

          x1   x2   x3   Tj
    X1     1    1    0    1
    X2     1    0    1    1
    X3     1    1    1   -1
Least-Square Learning Rule (4/6)
• Sol.: first compute $R$:
$R_1' = X_1 X_1^t = \begin{bmatrix}1\\1\\0\end{bmatrix}\begin{bmatrix}1&1&0\end{bmatrix} = \begin{bmatrix}1&1&0\\1&1&0\\0&0&0\end{bmatrix}$
$R_2' = X_2 X_2^t = \begin{bmatrix}1\\0\\1\end{bmatrix}\begin{bmatrix}1&0&1\end{bmatrix} = \begin{bmatrix}1&0&1\\0&0&0\\1&0&1\end{bmatrix}$
$R_3' = X_3 X_3^t = \begin{bmatrix}1\\1\\1\end{bmatrix}\begin{bmatrix}1&1&1\end{bmatrix} = \begin{bmatrix}1&1&1\\1&1&1\\1&1&1\end{bmatrix}$
$R = \frac{1}{3}(R_1' + R_2' + R_3') = \frac{1}{3}\begin{bmatrix}3&2&2\\2&2&1\\2&1&2\end{bmatrix} = \begin{bmatrix}1&\frac{2}{3}&\frac{2}{3}\\\frac{2}{3}&\frac{2}{3}&\frac{1}{3}\\\frac{2}{3}&\frac{1}{3}&\frac{2}{3}\end{bmatrix}$
朝陽科技大學 李麗華 教授
Least-Square Learning Rule (5/6)
$P_1^t = T_1 X_1^t = 1 \cdot (1,1,0) = (1,1,0)$
$P_2^t = T_2 X_2^t = 1 \cdot (1,0,1) = (1,0,1)$
$P_3^t = T_3 X_3^t = -1 \cdot (1,1,1) = (-1,-1,-1)$
$P^t = \frac{1}{3}(P_1^t + P_2^t + P_3^t) = \frac{1}{3}(1,0,0) = (\frac{1}{3},0,0)$

$R W^* = P$:
$\begin{bmatrix}1&\frac{2}{3}&\frac{2}{3}\\\frac{2}{3}&\frac{2}{3}&\frac{1}{3}\\\frac{2}{3}&\frac{1}{3}&\frac{2}{3}\end{bmatrix}\begin{bmatrix}W_1\\W_2\\W_3\end{bmatrix} = \begin{bmatrix}\frac{1}{3}\\0\\0\end{bmatrix}
\;\Rightarrow\;
\begin{cases}3W_1 + 2W_2 + 2W_3 = 1\\2W_1 + 2W_2 + W_3 = 0\\2W_1 + W_2 + 2W_3 = 0\end{cases}
\;\Rightarrow\;
W_1 = 3,\; W_2 = -2,\; W_3 = -2$
Least-Square Learning Rule (6/6)
• Verify the net:
[Figure: inputs X1, X2, X3 with weights 3, −2, −2 feed the ADALINE, producing output Y.]
Substituting (1,1,0): net = 3X1 − 2X2 − 2X3 = 1, so Y = 1 → ok
Substituting (1,0,1): net = 3X1 − 2X2 − 2X3 = 1, so Y = 1 → ok
Substituting (1,1,1): net = 3X1 − 2X2 − 2X3 = −1, so Y = −1 → ok
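The verification above can be reproduced numerically (a sketch assuming NumPy):

```python
import numpy as np

# Check that the solved weights W = (3, -2, -2) reproduce every target
# in the worked example.
W = np.array([3.0, -2.0, -2.0])
X = np.array([[1, 1, 0], [1, 0, 1], [1, 1, 1]], dtype=float)
T = np.array([1, 1, -1])
nets = X @ W                    # net = 3*X1 - 2*X2 - 2*X3 per pattern
Y = np.where(nets >= 0, 1, -1)  # bipolar threshold
```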
Proof of Least Square Learning Rule (1/3)
• We use the Least Mean Square Error to ensure the minimum total error: when the total error reaches its minimum, the best solution is found. Therefore, we look for the minimum of $\langle \varepsilon_k^2 \rangle$.
• Proof:
$\langle \varepsilon_k^2 \rangle = \frac{1}{L}\sum_{k=1}^{L}\varepsilon_k^2 = \frac{1}{L}\sum_{k=1}^{L}(T_k - Y_k)^2 = \frac{1}{L}\sum_{k=1}^{L}(T_k^2 - 2T_kY_k + Y_k^2)$
$= \frac{1}{L}\sum_{k=1}^{L}T_k^2 - \frac{2}{L}\sum_{k=1}^{L}T_kY_k + \frac{1}{L}\sum_{k=1}^{L}Y_k^2$
Since $Y_k = W^t X_k$, the last term can be written
$\frac{1}{L}\sum_{k=1}^{L}Y_k^2 = W^t\left[\frac{1}{L}\sum_{k=1}^{L}X_kX_k^t\right]W$
Proof of Least Square Learning Rule (2/3)
(continued from the previous page)
$\langle \varepsilon_k^2 \rangle = \frac{1}{L}\sum_{k=1}^{L}T_k^2 - \frac{2}{L}\sum_{k=1}^{L}T_kY_k + W^t\left[\frac{1}{L}\sum_{k=1}^{L}X_kX_k^t\right]W$
$= \langle T_k^2 \rangle - \frac{2}{L}\sum_{k=1}^{L}T_k(W^tX_k) + W^t\langle X_kX_k^t\rangle W$
$= \langle T_k^2 \rangle - 2\langle T_kX_k^t\rangle W + W^t\langle X_kX_k^t\rangle W$  (let the term $\langle X_kX_k^t\rangle$ be $R$)
Proof of Least Square Learning Rule(3/3)
Proof of Least Square Learning Rule (3/3)
Let $R_k = X_kX_k^t$; i.e., $R_k$ is an $n \times n$ matrix, also called the Correlation Matrix.
Let $R' = R_1 + R_2 + \cdots + R_k + \cdots + R_L$, and recall $P^t = \frac{1}{L}\sum_{k=1}^{L}T_kX_k^t$.
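The proof's claim, that $W^* = R^{-1}P$ minimizes $\langle \varepsilon_k^2 \rangle$, can be checked numerically on the worked example from the earlier slides. This is an illustrative sketch assuming NumPy; the random-perturbation test is not part of the original proof:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[1, 1, 0], [1, 0, 1], [1, 1, 1]], dtype=float)
T = np.array([1.0, 1.0, -1.0])
L = len(X)
R = X.T @ X / L                  # correlation matrix from the proof
P = X.T @ T / L
W_star = np.linalg.solve(R, P)   # W* = R^-1 P

def mse(W):
    """Mean squared error <eps_k^2> for weight vector W."""
    return np.mean((T - X @ W) ** 2)

# every random perturbation of W* yields an equal-or-larger error
worse = all(mse(W_star + rng.normal(0, 0.5, 3)) >= mse(W_star)
            for _ in range(100))
```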