In [1]:
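# QuickView's DataFrameVisualizer prints a text summary of a DataFrame
import QuickView.DataFrameVisualizer as qv
import pandas as pd

print('loading dataset.csv')
# Load the training data and print its summary
qv.visualize(pd.read_csv('data/train.csv'))
print('Done.....')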
loading dataset.csv
Here is a summary of the Dataset, aka Quick View :P ...
Rows count: 532428
Columns count: 45
Number of rows having null value(s): 532428
Numeric Columns: member_id, loan_amnt, funded_amnt, funded_amnt_inv, int_rate, annual_inc, dti, delinq_2yrs, inq_last_6mths, mths_since_last_delinq, mths_since_last_record, open_acc, pub_rec, revol_bal, revol_util, total_acc, total_rec_int, total_rec_late_fee, recoveries, collection_recovery_fee, collections_12_mths_ex_med, mths_since_last_major_derog, acc_now_delinq, tot_coll_amt, tot_cur_bal, total_rev_hi_lim, loan_status
Categorical Columns: term, batch_enrolled, grade, sub_grade, emp_length, home_ownership, verification_status, pymnt_plan, purpose, zip_code, addr_state, initial_list_status, application_type, verification_status_joint, last_week_pay
Text Columns: emp_title, desc, title
Columns with null values...
title : 90
inq_last_6mths : 16
collections_12_mths_ex_med : 95
batch_enrolled : 85149
verification_status_joint : 532123
total_rev_hi_lim : 42004
tot_cur_bal : 42004
open_acc : 16
emp_title : 30830
pub_rec : 16
total_acc : 16
mths_since_last_delinq : 272554
acc_now_delinq : 16
revol_util : 287
delinq_2yrs : 16
tot_coll_amt : 42004
annual_inc : 3
mths_since_last_major_derog : 399448
mths_since_last_record : 450305
desc : 456829
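The null-value figures above can be cross-checked without QuickView. What follows is a minimal sketch in plain pandas, not QuickView's own implementation; the helper name `null_summary` and the toy frame are hypothetical and exist only for illustration.

```python
import pandas as pd


def null_summary(df: pd.DataFrame):
    """Return (number of rows with at least one null, per-column null counts > 0)."""
    # A row counts once, no matter how many of its cells are null.
    rows_with_null = int(df.isnull().any(axis=1).sum())
    per_column = df.isnull().sum()
    # Keep only the columns that actually contain nulls.
    per_column = per_column[per_column > 0]
    return rows_with_null, per_column


# Toy data for illustration only; not taken from data/train.csv.
toy = pd.DataFrame({
    "annual_inc": [50000.0, None, 72000.0],
    "title": ["car", None, None],
    "grade": ["A", "B", "C"],
})
rows_with_null, per_column = null_summary(toy)
print("Number of rows having null value(s):", rows_with_null)
for name, count in per_column.items():
    print(name, ":", count)
```

Applied to the real data, the same call would take `pd.read_csv('data/train.csv')` in place of `toy`.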
Distinct values in categorical columns...
{Column Name}: term {Values}: 36 months, 60 months
{Column Name}: initial_list_status {Values}: f, w
{Column Name}: verification_status {Values}: Not Verified, Verified, Source Verified
{Column Name}: batch_enrolled {Values}: BAT1135695, BAT4271519, BAT2803411, nan, BAT1586599, BAT2136391, BAT5714674, , BAT5811547, BAT3726927, BAT5629144, BAT2558388, BAT1104812, BAT1766061, BAT4136152, BAT5540558, BAT3594334, BAT4351734, BAT2522922, BAT1184694, BAT5547201, BAT4694572, BAT5924421, BAT2252229, BAT5525466, BAT2833642, BAT2333412, BAT1691418, BAT2881453, BAT3372536, BAT1761981, BAT1327206, BAT3275209, BAT5849876, BAT1914408, BAT5341619, BAT224923, BAT5046385, BAT1467036, BAT3873588, BAT2428731, BAT2078974, BAT2003848, BAT2015867, BAT3865626, BAT1942645, BAT4808022, BAT47674, BAT4201183, BAT4722912, BAT20678, BAT4935307, BAT1930365, BAT5877328, BAT3461431, BAT3839056, BAT3260421, BAT5489674, BAT1780517, BAT2575549, BAT1521494, BAT3518025, BAT1755192, BAT2677031, BAT348786, BAT3193689, BAT3157685, BAT291187, BAT4786748, BAT3975721, BAT4051248, BAT3160077, BAT5614983, BAT1864701, BAT5320519, BAT5597801, BAT5458862, BAT3474907, BAT1273836, BAT4780022, BAT3292317, BAT357701, BAT2881062, BAT3943761, BAT2974007, BAT3147293, BAT447257, BAT3706046, BAT5477261, BAT3537993, BAT4734809, BAT5077496, BAT4939736, BAT2143459, BAT4884699, BAT4260473, BAT4347689, BAT1575727, BAT4726815, BAT2331079, BAT3965509, BAT4160421, BAT4250975, BAT5869156, BAT578944
{Column Name}: verification_status_joint {Values}: nan, Not Verified, Verified, Source Verified
{Column Name}: application_type {Values}: INDIVIDUAL, JOINT
{Column Name}: addr_state {Values}: NY, OH, IL, FL, DE, TX, AR, MD, CA, AZ, VA, GA, NJ, MO, MA, WY, WI, KS, CT, PA, CO, TN, UT, NC, ND, MI, LA, MS, NM, NV, WA, MN, MT, SD, OK, AL, AK, RI, IN, HI, ME, OR, SC, VT, KY, WV, NH, DC, NE, ID, IA
{Column Name}: home_ownership {Values}: OWN, RENT, MORTGAGE, ANY, OTHER, NONE
{Column Name}: sub_grade {Values}: A2, G4, B1, D4, B4, F2, C1, D2, B5, D3, C5, A5, C4, B2, A1, A4, F3, C3, E1, C2, D1, B3, G1, E2, F5, E3, E5, D5, A3, F1, E4, G2, F4, G3, G5
{Column Name}: purpose {Values}: debt_consolidation, credit_card, other, car, medical, house, wedding, home_improvement, small_business, moving, major_purchase, educational, vacation, renewable_energy
{Column Name}: pymnt_plan {Values}: n, y
{Column Name}: emp_length {Values}: 2 years, 10+ years, < 1 year, 4 years, 8 years, 1 year, 7 years, n/a, 6 years, 5 years, 3 years, 9 years
{Column Name}: last_week_pay {Values}: NAth week, 4th week, 0th week, 17th week, 9th week, 8th week, 13th week, 44th week, 18th week, 22th week, 52th week, 26th week, 35th week, 65th week, 39th week, 48th week, 57th week, 43th week, 30th week, 61th week, 70th week, 31th week, 21th week, 78th week, 109th week, 104th week, 92th week, 161th week, 91th week, 74th week, 56th week, 100th week, 83th week, 117th week, 96th week, 82th week, 122th week, 157th week, 156th week, 69th week, 87th week, 113th week, 144th week, 118th week, 139th week, 148th week, 135th week, 152th week, 126th week, 143th week, 108th week, 130th week, 95th week, 182th week, 131th week, 79th week, 165th week, 153th week, 265th week, 121th week, 178th week, 239th week, 226th week, 170th week, 235th week, 196th week, 217th week, 209th week, 169th week, 174th week, 187th week, 252th week, 183th week, 218th week, 256th week, 248th week, 231th week, 261th week, 213th week, 200th week, 192th week, 222th week, 204th week, 230th week, 270th week, 191th week, 244th week, 243th week, 257th week, 221th week, 291th week, 274th week, 205th week, 283th week, 269th week, 304th week, 300th week, 278th week
{Column Name}: grade {Values}: A, G, B, D, F, C, E
{Column Name}: zip_code {Values}: 114xx, 100xx, 432xx, 604xx, 333xx, 198xx, 110xx, 786xx, 770xx, 752xx, 728xx, 216xx, 943xx, 928xx, 115xx, 857xx, 941xx, 220xx, 303xx, 074xx, 926xx, 940xx, 346xx, 777xx, 630xx, 113xx, 208xx, 025xx, 826xx, 906xx, 210xx, 535xx, 856xx, 660xx, 334xx, 061xx, 183xx, 197xx, 801xx, 380xx, 182xx, 847xx, 452xx, 600xx, 920xx, 751xx, 852xx, 275xx, 580xx, 782xx, 300xx, 912xx, 221xx, 481xx, 754xx, 274xx, 328xx, 378xx, 708xx, 373xx, 917xx, 390xx, 088xx, 908xx, 802xx, 015xx, 946xx, 286xx, 019xx, 301xx, 544xx, 871xx, 891xx, 985xx, 211xx, 111xx, 331xx, 898xx, 397xx, 430xx, 930xx, 324xx, 337xx, 445xx, 064xx, 662xx, 932xx, 558xx, 310xx, 302xx, 925xx, 018xx, 021xx, 923xx, 083xx, 765xx, 591xx, 079xx, 640xx, 156xx, 577xx, 606xx, 743xx, 222xx, 140xx, 618xx, 366xx, 207xx, 321xx, 172xx, 820xx, 933xx, 146xx, 986xx, 330xx, 756xx, 270xx, 317xx, 151xx, 530xx, 112xx, 551xx, 130xx, 232xx, 954xx, 750xx, 713xx, 999xx, 633xx, 944xx, 787xx, 554xx, 740xx, 028xx, 178xx, 824xx, 070xx, 775xx, 809xx, 980xx, 181xx, 190xx, 465xx, 395xx, 968xx, 844xx, 044xx, 931xx, 970xx, 338xx, 741xx, 945xx, 472xx, 773xx, 774xx, 308xx, 067xx, 104xx, 890xx, 117xx, 184xx, 155xx, 294xx, 482xx, 125xx, 124xx, 054xx, 352xx, 086xx, 553xx, 136xx, 799xx, 727xx, 953xx, 952xx, 435xx, 171xx, 730xx, 423xx, 483xx, 731xx, 947xx, 322xx, 605xx, 128xx, 109xx, 329xx, 327xx, 382xx, 983xx, 935xx, 602xx, 710xx, 362xx, 212xx, 236xx, 674xx, 939xx, 463xx, 914xx, 657xx, 191xx, 042xx, 287xx, 271xx, 900xx, 347xx, 063xx, 922xx, 047xx, 014xx, 800xx, 913xx, 341xx, 295xx, 456xx, 105xx, 626xx, 277xx, 494xx, 895xx, 225xx, 297xx, 060xx, 280xx, 704xx, 131xx, 260xx, 916xx, 850xx, 440xx, 189xx, 282xx, 087xx, 152xx, 132xx, 956xx, 678xx, 658xx, 078xx, 339xx, 138xx, 907xx, 073xx, 342xx, 402xx, 617xx, 180xx, 776xx, 062xx, 032xx, 309xx, 739xx, 748xx, 368xx, 400xx, 141xx, 439xx, 995xx, 349xx, 587xx, 255xx, 075xx, 509xx, 784xx, 996xx, 107xx, 706xx, 967xx, 974xx, 609xx, 235xx, 780xx, 201xx, 648xx, 080xx, 992xx, 027xx, 654xx, 767xx, 
461xx, 123xx, 760xx, 085xx, 973xx, 293xx, 490xx, 462xx, 701xx, 068xx, 066xx, 581xx, 016xx, 153xx, 119xx, 020xx, 950xx, 561xx, 951xx, 471xx, 720xx, 284xx, 254xx, 134xx, 902xx, 385xx, 575xx, 927xx, 150xx, 473xx, 982xx, 840xx, 447xx, 894xx, 238xx, 240xx, 703xx, 588xx, 476xx, 493xx, 863xx, 645xx, 195xx, 805xx, 261xx, 480xx, 448xx, 971xx, 013xx, 434xx, 559xx, 921xx, 214xx, 283xx, 376xx, 989xx, 365xx, 749xx, 882xx, 972xx, 936xx, 875xx, 949xx, 077xx, 853xx, 258xx, 981xx, 281xx, 170xx, 622xx, 082xx, 958xx, 763xx, 336xx, 394xx, 484xx, 557xx, 813xx, 539xx, 320xx, 227xx, 257xx, 446xx, 919xx, 335xx, 755xx, 292xx, 116xx, 468xx, 245xx, 206xx, 601xx, 058xx, 177xx, 565xx, 745xx, 636xx, 644xx, 904xx, 762xx, 370xx, 808xx, 442xx, 781xx, 026xx, 666xx, 291xx, 185xx, 990xx, 631xx, 794xx, 864xx, 655xx, 276xx, 200xx, 386xx, 030xx, 615xx, 023xx, 532xx, 160xx, 431xx, 641xx, 199xx, 665xx, 217xx, 049xx, 296xx, 721xx, 993xx, 218xx, 411xx, 934xx, 278xx, 548xx, 056xx, 305xx, 242xx, 531xx, 234xx, 226xx, 314xx, 348xx, 845xx, 672xx, 298xx, 010xx, 103xx, 313xx, 029xx, 450xx, 607xx, 443xx, 233xx, 403xx, 712xx, 724xx, 547xx, 052xx, 351xx, 757xx, 034xx, 186xx, 144xx, 285xx, 388xx, 071xx, 072xx, 785xx, 319xx, 142xx, 975xx, 168xx, 360xx, 563xx, 323xx, 076xx, 031xx, 880xx, 253xx, 812xx, 957xx, 444xx, 783xx, 735xx, 747xx, 229xx, 350xx, 193xx, 729xx, 560xx, 700xx, 392xx, 629xx, 628xx, 671xx, 307xx, 498xx, 272xx, 393xx, 549xx, 325xx, 855xx, 571xx, 707xx, 984xx, 241xx, 173xx, 188xx, 597xx, 454xx, 387xx, 790xx, 405xx, 611xx, 401xx, 625xx, 381xx, 273xx, 608xx, 279xx, 474xx, 937xx, 711xx, 449xx, 101xx, 230xx, 685xx, 705xx, 851xx, 161xx, 761xx, 495xx, 806xx, 496xx, 129xx, 175xx, 486xx, 634xx, 179xx, 223xx, 372xx, 166xx, 764xx, 492xx, 460xx, 051xx, 012xx, 488xx, 726xx, 410xx, 194xx, 693xx, 841xx, 988xx, 883xx, 344xx, 038xx, 453xx, 209xx, 379xx, 583xx, 441xx, 652xx, 719xx, 299xx, 040xx, 145xx, 620xx, 616xx, 881xx, 722xx, 550xx, 451xx, 121xx, 688xx, 224xx, 976xx, 457xx, 139xx, 065xx, 759xx, 033xx, 256xx, 371xx, 
585xx, 691xx, 108xx, 318xx, 391xx, 768xx, 814xx, 316xx, 542xx, 846xx, 670xx, 162xx, 420xx, 357xx, 959xx, 650xx, 069xx, 860xx, 089xx, 244xx, 610xx, 567xx, 924xx, 497xx, 766xx, 960xx, 306xx, 816xx, 637xx, 315xx, 265xx, 237xx, 661xx, 266xx, 081xx, 795xx, 356xx, 374xx, 158xx, 543xx, 037xx, 246xx, 873xx, 469xx, 843xx, 039xx, 798xx, 485xx, 627xx, 120xx, 681xx, 250xx, 538xx, 433xx, 126xx, 815xx, 427xx, 725xx, 545xx, 911xx, 656xx, 017xx, 267xx, 977xx, 910xx, 788xx, 793xx, 359xx, 326xx, 612xx, 122xx, 252xx, 827xx, 614xx, 133xx, 534xx, 231xx, 540xx, 290xx, 570xx, 035xx, 011xx, 363xx, 599xx, 169xx, 564xx, 467xx, 546xx, 288xx, 639xx, 758xx, 811xx, 355xx, 404xx, 147xx, 159xx, 048xx, 377xx, 804xx, 174xx, 489xx, 829xx, 537xx, 361xx, 994xx, 638xx, 024xx, 716xx, 176xx, 955xx, 541xx, 164xx, 903xx, 653xx, 084xx, 148xx, 673xx, 897xx, 127xx, 744xx, 464xx, 877xx, 961xx, 680xx, 196xx, 598xx, 475xx, 304xx, 736xx, 458xx, 753xx, 118xx, 022xx, 163xx, 647xx, 106xx, 562xx, 436xx, 477xx, 167xx, 718xx, 717xx, 396xx, 797xx, 421xx, 668xx, 479xx, 455xx, 723xx, 978xx, 384xx, 437xx, 354xx, 438xx, 791xx, 664xx, 415xx, 746xx, 243xx, 870xx, 478xx, 679xx, 239xx, 096xx, 251xx, 778xx, 675xx, 619xx, 676xx, 998xx, 997xx, 595xx, 566xx, 487xx, 389xx, 624xx, 187xx, 572xx, 149xx, 603xx, 165xx, 398xx, 219xx, 772xx, 262xx, 367xx, 050xx, 383xx, 905xx, 825xx, 143xx, 796xx, 228xx, 135xx, 466xx, 734xx, 422xx, 041xx, 102xx, 803xx, 915xx, 737xx, 684xx, 157xx, 582xx, 948xx, 590xx, 491xx, 358xx, 586xx, 714xx, 264xx, 406xx, 137xx, 613xx, 810xx, 596xx, 053xx, 425xx, 263xx, 865xx, 646xx, 594xx, 499xx, 412xx, 789xx, 556xx, 991xx, 407xx, 424xx, 154xx, 576xx, 677xx, 979xx, 669xx, 769xx, 312xx, 874xx, 667xx, 651xx, 823xx, 687xx, 247xx, 683xx, 043xx, 470xx, 574xx, 962xx, 592xx, 807xx, 364xx, 686xx, 779xx, 215xx, 409xx, 046xx, 249xx, 573xx, 635xx, 690xx, 528xx, 045xx, 918xx, 289xx, 963xx, 332xx, 830xx, 092xx, 268xx, 859xx, 426xx, 091xx, 879xx, 792xx, 408xx, 413xx, 623xx, 878xx, 738xx, 522xx, 689xx, 833xx, 584xx, 893xx, 008xx, 
057xx, 259xx, 418xx, 828xx, 416xx, 831xx, 527xx, 369xx, 248xx, 429xx, 822xx, 311xx, 819xx, 094xx, 593xx, 821xx, 500xx, 417xx, 503xx, 090xx, 510xx, 965xx, 036xx, 838xx, 742xx, 836xx, 098xx, 884xx, 702xx, 414xx, 059xx, 097xx, 093xx, 682xx, 269xx, 649xx, 832xx, 523xx, 834xx, 502xx, 514xx, 204xx, 520xx, 837xx, 692xx, 007xx, 340xx, 901xx, 942xx, 513xx, 854xx, 511xx, 353xx, 861xx, 643xx, 835xx, 771xx, 909xx, 507xx, 938xx, 205xx, 569xx, 889xx, 969xx, 343xx, 663xx, 888xx
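A listing like the one above can be approximated with `Series.unique`. This is a sketch under assumptions, not QuickView's code: the helper `distinct_values` and the toy frame are hypothetical. `unique` keeps missing values as a distinct entry, which may be why `nan` shows up in some of the lists.

```python
import pandas as pd


def distinct_values(df: pd.DataFrame, columns):
    """Map each column name to its distinct values, in order of first appearance."""
    return {col: df[col].unique().tolist() for col in columns}


# Toy data for illustration only; not taken from data/train.csv.
toy = pd.DataFrame({
    "term": ["36 months", "60 months", "36 months"],
    "grade": ["A", "G", "A"],
})
for col, values in distinct_values(toy, ["term", "grade"]).items():
    print("{Column Name}:", col, "{Values}:", ", ".join(map(str, values)))
```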
Min, Max, Mean and std...
                                     max          mean       min           std
member_id                    73544841.00  3.500547e+07  70473.00  2.412148e+07
loan_amnt                       35000.00  1.475760e+04    500.00  8.434420e+03
funded_amnt                     35000.00  1.474427e+04    500.00  8.429139e+03
funded_amnt_inv                 35000.00  1.470493e+04      0.00  8.441290e+03
int_rate                           28.99  1.324297e+01      5.32  4.379611e+00
annual_inc                    9500000.00  7.502984e+04   1200.00  6.519985e+04
dti                               672.52  1.813877e+01      0.00  8.369074e+00
delinq_2yrs                        30.00  3.144482e-01      0.00  8.600449e-01
inq_last_6mths                     31.00  6.946031e-01      0.00  9.970255e-01
mths_since_last_delinq            180.00  3.405573e+01      0.00  2.188480e+01
mths_since_last_record            121.00  7.009307e+01      0.00  2.813922e+01
open_acc                           90.00  1.154559e+01      0.00  5.311442e+00
pub_rec                            86.00  1.948585e-01      0.00  5.838222e-01
revol_bal                     2568995.00  1.692128e+04      0.00  2.242322e+04
revol_util                        892.30  5.505719e+01      0.00  2.385344e+01
total_acc                         162.00  2.526736e+01      1.00  1.184321e+01
total_rec_int                   24205.62  1.753429e+03      0.00  2.093200e+03
total_rec_late_fee                358.68  3.949535e-01      0.00  4.091546e+00
recoveries                      33520.27  4.571783e+01      0.00  4.096475e+02
collection_recovery_fee          7002.19  4.859221e+00      0.00  6.312336e+01
collections_12_mths_ex_med         16.00  1.429932e-02      0.00  1.330052e-01
mths_since_last_major_derog       180.00  4.412146e+01      0.00  2.219841e+01
acc_now_delinq                     14.00  5.014913e-03      0.00  7.911681e-02
tot_coll_amt                   496651.00  2.135622e+02      0.00  1.958572e+03
tot_cur_bal                   8000078.00  1.395541e+05      0.00  1.539149e+05
total_rev_hi_lim              9999999.00  3.208057e+04      0.00  3.805304e+04
loan_status                         1.00  2.363268e-01      0.00  4.248256e-01
Done.....
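The "Min, Max, Mean and std" table in the output above has the same shape as a transposed pandas aggregation. The sketch below shows one way to build such a table; it is an assumption about how the numbers could be produced, not QuickView's actual code, and `numeric_summary` and the toy frame are hypothetical. Note that pandas computes `std` as the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd


def numeric_summary(df: pd.DataFrame) -> pd.DataFrame:
    """One row per numeric column, with columns max, mean, min and std."""
    numeric = df.select_dtypes(include="number")
    # agg() returns one row per statistic; transpose so each column becomes a row.
    return numeric.agg(["max", "mean", "min", "std"]).T


# Toy data for illustration only; not taken from data/train.csv.
toy = pd.DataFrame({
    "loan_amnt": [500.0, 35000.0, 10000.0],
    "int_rate": [5.32, 28.99, 13.0],
    "grade": ["A", "G", "B"],
})
summary = numeric_summary(toy)
print(summary)
```

Non-numeric columns such as `grade` are dropped by `select_dtypes`, so only numeric columns appear in the result.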
Content source: avannaldas/QuickView