Estimating Lorenz Curve for Iran by Using Continuous L1 Norm Estimation

In this paper, the L1 norm of continuous functions and the corresponding continuous estimation of regression parameters are defined. The continuous L1 norm estimation problems for the one- and two-parameter linear models are solved. We then use the functional form and parameters of the probability distribution function of income to determine exactly the L1 norm approximation of the corresponding Lorenz curve of the statistical population under consideration. Iranian family budget data are used to estimate the income distribution for the period 1362-1370 (1983-1991).


Introduction
Income distributions are persistently skewed, across different populations and at different times. Distributions of the Pearsonian family have been discussed as rival functions for explaining income distribution. The Lorenz curve is a tool for analyzing such skewed distributions. There is a relation between the area under the Lorenz curve and the corresponding probability distribution function of the statistical population (see Kendall and Stuart (1977)). That is, when the probability distribution function is known, we may find the corresponding Gini coefficient as a measure of inequality.
Estimation of the Lorenz curve faces some difficulties. For this estimation, we should define an appropriate functional form that can accommodate different curvatures (see Bidabad and Bidabad (1989a,b)). A further problem is that creating the data set needed to estimate the parameters of the Lorenz curve requires a large amount of computation on raw sample income data. Besides their computational burden, these problems weaken the significance of the estimated parameters (see Bidabad and Bidabad (1989a,b)). To avoid this, we estimate the functional form of the Lorenz curve by using continuous information: in this paper, we use the probability density function of population income to estimate the Lorenz function parameters. The continuous L1 norm smoothing method, which is developed below for estimating regression parameters, is used to solve this problem. We concentrate on two rival probability density functions, Pareto and log-normal. Since the former is easily integrable, its Lorenz function can be derived directly and uniquely. The log-normal density performs better over the full income range than the Pareto distribution, which fits the higher income range better (see Cramer (1973), Singh and Maddala (1976), Salem and Mount (1974)); however, its integral is not available in closed form, so its Lorenz function cannot be determined analytically. In this case, we solve the problem by defining a general functional form for the Lorenz curve and applying L1 norm smoothing to estimate the corresponding parameters.
In this paper, continuous L1 norm estimation is developed by a method similar to that proposed in Bidabad (1987a,88a,89a,b) for the discrete case. The method is then applied to the estimation of the Lorenz curve functional forms proposed by Gupta (1984) and Bidabad and Bidabad (1989,92). Finally, we use our formulation to estimate the Gini ratio and Kakwani length indices of inequality for Iran for the period 1362-1370 (1983-1991), based on the assumption that income is distributed log-normally.
Before applying this procedure to the Lorenz curve, let us develop the procedure for the two-parameter linear model.

Linear two parameters L1 norm continuous smoothing
Now, we apply the above technique to the linear two-parameter model. Rewrite (4) as,

$$\min_{\alpha,\beta}:\quad S=\|u\|_1=\|y(x)-\alpha-\beta x\|_1=\int_{x\in I}|y(x)-\alpha-\beta x|\,dx \qquad (16)$$

where $\alpha$ and $\beta$ are two single (non-vector) unknown parameters and $y(x)$ and $x$ are as before. According to Rice (1964c), let $f(\alpha^*,\beta^*,x)$ interpolate $y(x)$ at the set of canonical points $\{x_i;\ i=1,2\}$; if $y(x)$ is such that $y(x)-f(\alpha^*,\beta^*,x)$ changes sign at these $x_i$'s and at no other points in $[0,1]$, then $f(\alpha^*,\beta^*,x)$ is the best L1 norm approximation to $y(x)$ (see also Usow (1967a)). With the help of this rule, if we denote these two points by $t_1$ and $t_2$, we can rewrite (16) as (17). Since $t_1$ and $t_2$ are also unknown, we should minimize $S$ with respect to $\alpha$, $\beta$, $t_1$ and $t_2$. Taking the partial derivatives of (17) with respect to these variables using Leibniz's rule and equating them to zero gives equations (18) through (21), which may be solved simultaneously for $\alpha$, $\beta$, $t_1$ and $t_2$. The result is the system of equations (29).

This procedure, like that of the multiple regression model in the discrete case, may be expanded to include "m" unknown parameters, which is not discussed here. Some computational methods for solving the different cases of the m-parameter model are investigated by Ptak (1958), Rice and White (1964), Rice (1964a,b,c,69,85), Usow (1967a), Lazarski (1975a,b,c,77) (see also Hobby and Rice (1965), Kripke and Rivlin (1965), Watson (1981)).

Now, let us have a look at the Lorenz curve and its proposed functional forms.
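The two-parameter procedure of this section can be illustrated with a small Python sketch. It rests on an assumption that is a reconstruction, not a quotation of the paper's equations: when the residual changes sign exactly twice on [0,1], setting the derivatives of S with respect to α and β to zero gives t2 - t1 = 1/2 and t2² - t1² = 1/2, hence t1 = 1/4 and t2 = 3/4, and the derivatives with respect to t1 and t2 require the line to interpolate y(x) at those two points. The function names and the test function y(x) = x² are hypothetical.

```python
def best_l1_line(y):
    """Interpolate y at the assumed canonical points 1/4 and 3/4 of [0, 1].

    Valid only when y(x) - alpha - beta*x changes sign at exactly these
    two points (Rice's condition quoted in the text).
    """
    t1, t2 = 0.25, 0.75
    beta = (y(t2) - y(t1)) / (t2 - t1)
    alpha = y(t1) - beta * t1
    return alpha, beta


def l1_distance(y, alpha, beta, n=20000):
    """Midpoint-rule approximation of the integral of |y(x) - alpha - beta*x| on [0, 1]."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += abs(y(x) - alpha - beta * x)
    return h * total


# Hypothetical example: y(x) = x**2, whose residual (x - 1/4)(x - 3/4)
# changes sign only at the two canonical points.
y = lambda x: x * x
alpha, beta = best_l1_line(y)      # alpha = -0.1875, beta = 1.0
best = l1_distance(y, alpha, beta)  # close to 1/16
```

Perturbing either parameter slightly and re-evaluating `l1_distance` gives a quick numerical check that the interpolating line is not beaten by its neighbours.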

Lorenz curve
The Lorenz curve for a random variable with probability density function $f(v)$ may be defined as the ordered pair

$$\left(P(V\mid V\le v),\ \frac{E(V\mid V\le v)}{E(V)}\right),\qquad v\in R \qquad (30)$$

where "P" and "E" stand for the probability and expected value operators. For a continuous density function $f(v)$, (30) can be written as,

$$\left(\int_{-\infty}^{v} f(w)\,dw,\ \frac{\int_{-\infty}^{v} w f(w)\,dw}{\int_{-\infty}^{+\infty} w f(w)\,dw}\right) \qquad (31)$$

We denote (31) by $(x(v),y(x(v)))$, where $x(v)$ and $y(x(v))$ are its elements. Therefore, "x" is a function which maps "v" to $x(v)$ and "y" is a function which maps $x(v)$ to $y(x(v))$. The function $y(x(v))$ is simply the Lorenz curve function. In recent years some functional forms for the Lorenz curve have been introduced. Among the different proposed functions, we use the forms of Gupta (1984) and Bidabad and Bidabad (1989,92), which have certain desirable properties (see their articles for more explanation). Gupta (1984) proposed the functional form,

$$y=xA^{x-1},\qquad A>1 \qquad (32)$$

Bidabad and Bidabad (1989,92) suggest the following functional form:

$$y=x^{B}A^{x-1},\qquad B\ge 1,\ A\ge 1 \qquad (33)$$

To estimate the above functions by a regular estimation method, we should gather discrete data from the statistical population and manipulate them to construct the relevant x and y vectors to estimate "A" of (32) or "A" and "B" of (33). If the probability distribution of income is known, instead of gathering discrete observations, we can estimate the Lorenz curve by using the continuous L1 norm smoothing method for continuous functions. In the following section, we apply this method to estimate the parameter "A" of (32) and the parameters "A" and "B" of (33) by using the information in the probability density function of income.
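As a small illustration of the two functional forms, the following Python sketch evaluates (32) and (33) and computes the Gini ratio as one minus twice the area under the curve. The function names, the midpoint integration rule and the parameter values in the usage line are assumptions chosen for illustration; they do not come from the paper.

```python
def gupta_lorenz(x, A):
    """Form (32): y = x * A**(x - 1), with A > 1."""
    return x * A ** (x - 1.0)


def bidabad_lorenz(x, A, B):
    """Form (33): y = x**B * A**(x - 1), with A >= 1 and B >= 1."""
    return x ** B * A ** (x - 1.0)


def gini(lorenz, n=10000):
    """Gini ratio as 1 - 2 * (area under the Lorenz curve), midpoint rule on [0, 1]."""
    h = 1.0 / n
    area = h * sum(lorenz((i + 0.5) * h) for i in range(n))
    return 1.0 - 2.0 * area


# Illustrative parameter values only (not estimates from the paper).
g = gini(lambda x: bidabad_lorenz(x, A=2.0, B=1.5))
```

Both forms pass through (0,0) and (1,1), and (33) reduces to (32) when B = 1, which is easy to confirm by evaluating the functions at a few points.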

Continuous L1 norm smoothing of Lorenz curve
To estimate the Lorenz curve parameters when the income probability density function is known, we cannot always take straightforward steps. When the probability density function is easily integrable, there is no major difficulty: we can find the functional relationship between the two elements of (31) by simple mathematical derivation. But when the integrals of (31) cannot be obtained, another procedure should be adopted.
In the case of the Pareto density function (34), we can simply derive the Lorenz curve function as follows. Let $F(w)$ denote the Pareto distribution function,

$$F(w)=1-\left(\frac{k}{w}\right)^{\theta} \qquad (36)$$

with mean equal to,

$$E(w)=\frac{\theta k}{\theta-1},\qquad \theta>1 \qquad (37)$$

If we express the function y of (31) as a function of x, the Lorenz function is obtained. Rearranging the terms of (31) and evaluating the integrals leads to,

$$y(x(v))=1-\left(\frac{k}{v}\right)^{\theta-1} \qquad (42)$$

Now, by solving (40) for "v" and substituting in (42), the Lorenz curve for the Pareto distribution is derived as,

$$y=1-(1-x)^{(\theta-1)/\theta} \qquad (43)$$

As shown for the Pareto distribution, the formula of the Lorenz curve is easily obtained. But if we select the log-normal density function (35), the procedure cannot be the same, because the integrals of the log-normal density are not available in closed form. In the following pages, the L1 norm smoothing technique is developed to estimate the parameters of the given functional forms (32) and (33) by using the continuous probability density function.
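For the Pareto case above, the intermediate steps can be written out as a short worked derivation. This is a sketch under one stated assumption: the density is taken to be the standard Pareto density that corresponds to the distribution function (36), since equation (34) itself is not reproduced in this section. The lines below are not numbered to match the paper's equations.

```latex
% Assumption: f(w) = \theta k^{\theta} w^{-\theta-1} for w \ge k > 0 and \theta > 1,
% i.e. the derivative of F(w) = 1 - (k/w)^{\theta} in (36).
\begin{align*}
x(v) &= \int_{k}^{v} \theta k^{\theta} w^{-\theta-1}\,dw
      = 1-\left(\frac{k}{v}\right)^{\theta},\\[4pt]
\int_{k}^{v} w\,\theta k^{\theta} w^{-\theta-1}\,dw
     &= \frac{\theta k}{\theta-1}\left[1-\left(\frac{k}{v}\right)^{\theta-1}\right],\\[4pt]
y(x(v)) &= \frac{\dfrac{\theta k}{\theta-1}\left[1-\left(\dfrac{k}{v}\right)^{\theta-1}\right]}
                {\dfrac{\theta k}{\theta-1}}
         = 1-\left(\frac{k}{v}\right)^{\theta-1},\\[4pt]
\frac{k}{v} &= (1-x)^{1/\theta}
  \;\Longrightarrow\;
  y = 1-(1-x)^{(\theta-1)/\theta}.
\end{align*}
```

The second line is the numerator of the second element of (31), and dividing it by the mean (37) gives the third line; the last line eliminates v using the first.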
The procedure for model (33) is similar, with the difference that two values of "v" should be computed. Once these two values of "v" are computed from the corresponding integral equations (ending with (82)), they are substituted in (45) to find y(0.07549) and y(0.40442). These values of "y" are then used to compute the parameters of model (33) by substituting them into (78) and (79).
The only remaining problem is the computation of the definite integrals of x(v) defined by (80), (81) and (82), which can be done by appropriate numerical methods; the enclosed sample computer program, coded for MathCAD 11, gives a complete example.
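The numerical steps described above for model (33) can be sketched in Python as follows. This is not the enclosed MathCAD program, and several choices are assumptions made for illustration: the two canonical points are taken from the text; the parameter equations (78) and (79), which are not reproduced here, are assumed to amount to interpolating ln y = B ln x + (x - 1) ln A at those two points; the integrals are evaluated by Simpson's rule after the substitution z = (ln w - µ)/σ, under which µ cancels from the Lorenz curve; and v is located by bisection. All function names and the value of σ are hypothetical.

```python
import math

CANONICAL_X = (0.07549, 0.40442)  # canonical points quoted in the text for model (33)
Z_MIN = -12.0                     # assumed lower truncation of the standardized integral


def _phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def _simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b]; n must be even."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * g(a + i * h)
    return total * h / 3.0


def lorenz_point(z, sigma):
    """Return (x, y) of (31) for log-normal income at v = exp(mu + sigma * z)."""
    x = _simpson(_phi, Z_MIN, z)
    partial_mean = _simpson(lambda t: math.exp(sigma * t) * _phi(t), Z_MIN, z)
    y = partial_mean / math.exp(0.5 * sigma * sigma)  # exp(mu) cancels in the ratio
    return x, y


def z_for_share(x_target, lo=Z_MIN, hi=12.0, tol=1e-10):
    """Bisection for the standardized income z at which x(v) equals x_target."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if _simpson(_phi, Z_MIN, mid) < x_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fit_model_33(sigma):
    """Solve ln y = B ln x + (x - 1) ln A at the two canonical points."""
    (x1, y1), (x2, y2) = (lorenz_point(z_for_share(x), sigma) for x in CANONICAL_X)
    a11, a12, b1 = math.log(x1), x1 - 1.0, math.log(y1)
    a21, a22, b2 = math.log(x2), x2 - 1.0, math.log(y2)
    det = a11 * a22 - a12 * a21
    B = (b1 * a22 - a12 * b2) / det       # Cramer's rule for B
    ln_A = (a11 * b2 - a21 * b1) / det    # Cramer's rule for ln A
    return math.exp(ln_A), B


# Illustrative value of sigma only (not an estimate from the paper).
A, B = fit_model_33(sigma=0.8)
```

By construction, the fitted curve of form (33) passes through the two computed Lorenz points, which is a convenient sanity check on the arithmetic.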

Income distribution in Iran
In order to compute the Lorenz curve for Iran, we apply the above procedure to both forms (32) and (33), under the assumption of a log-normal distribution function. The source of the data is the "Statistical Center of Iran", which computed the mean and variance of income for urban and rural families for the period 1362-1370 (1983-1991) from the "Family Budget Surveys" of different years. These data are given in Table 1. The mean and variance of income were used to derive the log-normal density function parameters µ and σ. The estimation procedure explained above was then applied to the data series of Table 1, and the corresponding results are reported in Table 2. A sample computer program is also enclosed at the end of these pages.
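The step from the mean and variance of income to the log-normal parameters can be sketched as below. The sketch assumes the usual moment relations for a log-normal variable, mean = exp(µ + σ²/2) and variance = (exp(σ²) - 1) exp(2µ + σ²); the text does not state which formulas were actually used, and the function name and input numbers are hypothetical.

```python
import math


def lognormal_params(mean, variance):
    """Moment-matching log-normal parameters (mu, sigma) from the income mean and variance."""
    sigma2 = math.log(1.0 + variance / mean ** 2)
    mu = math.log(mean) - 0.5 * sigma2
    return mu, math.sqrt(sigma2)


# Illustrative numbers only (NOT taken from Table 1).
mu, sigma = lognormal_params(mean=100.0, variance=6400.0)
```

Only σ affects the shape of the log-normal Lorenz curve; µ fixes the income scale and cancels in the ratio that defines y.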